Title: | Green Finance and Environmental Risk |
---|---|
Description: | Focuses on data collection, analysis and visualization in green finance and environmental risk research. Main functions include environmental data collection from official websites such as MEP (Ministry of Environmental Protection of China, <https://www.mee.gov.cn>), identification of water-related projects, and environmental data visualization. |
Authors: | Yuanchao Xu [aut, cre] |
Maintainer: | Yuanchao Xu <[email protected]> |
License: | GPL-2 |
Version: | 0.1.12 |
Built: | 2025-03-07 05:51:13 UTC |
Source: | https://github.com/yuanchao-xu/gfer |
private function for checking the HTTP status
checkHttpStatus(ret)
ret |
the response object returned by the httr package |
returns nothing, but stops the script if it finds an error
Xuehui YANG (2016). rstatscn: R Interface for China National Data. R package version 1.1.1. https://CRAN.R-project.org/package=rstatscn
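The behaviour documented for checkHttpStatus (return nothing, stop on error) can be sketched in base R. This is only an illustration, not the package's implementation: `check_status_sketch` is a hypothetical helper that takes a plain status code instead of an httr response object, and treating every code other than 200 as an error is an assumption.

```r
# Hypothetical helper: stop the script unless the HTTP status code is 200.
# The real checkHttpStatus() receives an httr response object instead.
check_status_sketch <- function(status_code) {
  if (!identical(as.integer(status_code), 200L)) {
    stop("HTTP request failed with status code ", status_code, call. = FALSE)
  }
  invisible(NULL)  # nothing useful is returned on success
}

check_status_sketch(200)  # passes silently
# check_status_sketch(404) would stop with an error
```

Returning `invisible(NULL)` keeps the call quiet at the console, which matches the "return nothing" wording.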
Matrix showing the complicated management of China's water resources
cm
A data frame with 13 rows and 11 variables:
...
private function to convert the returned JSON data to a data frame
dataJson2df(rawObj, rowcode, colcode)
rawObj |
the fromJSON output |
rowcode |
rowcode in the data frame |
colcode |
colcode in the data frame |
the constructed data frame
Xuehui YANG (2016). rstatscn: R Interface for China National Data. R package version 1.1.1. https://CRAN.R-project.org/package=rstatscn
Table of the GDP mix of China's provinces in 2015
GDPmix
A data frame with 11 rows and 7 variables:
...
private function for constructing the query parameter for dfwds
genDfwds(wdcode, valuecode)
wdcode |
string value, one of c("zb","sj","reg") |
valuecode |
string value; the available values depend on wdcode. zb: the valuecode can be obtained with the statscnQueryZb() function. sj: the valuecode can be "2014" for the nd db or "2014C" for the jd db. reg: the valuecode is the region code fetched by the statscnRegions(dbcode) function |
returns the query string for the HTTP request
Xuehui YANG (2016). rstatscn: R Interface for China National Data. R package version 1.1.1. https://CRAN.R-project.org/package=rstatscn
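Since wdcode accepts only three values, a caller may want to validate it before building a query. The sketch below uses base R's `match.arg`; `check_wdcode` is a hypothetical helper, not part of gfer or rstatscn, and note that `match.arg` also accepts unambiguous partial matches.

```r
# Hypothetical helper: validate wdcode against the three documented values.
check_wdcode <- function(wdcode) {
  match.arg(wdcode, choices = c("zb", "sj", "reg"))
}

check_wdcode("sj")      # returns the matched value
# check_wdcode("year")  # would stop with an error listing the allowed values
```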
get CSR rating from a website
getCSRRating(startPage, endPage, year = 2015, proxy = FALSE)
startPage |
the page on which you want to start |
endPage |
the page on which you want to stop scraping |
year |
the year for which you want the ratings |
proxy |
whether to use a proxy, default is FALSE |
Get CSR ratings and reports of different companies from http://stockdata.stock.hexun.com/zrbg/
A table of CSR ratings collected from your input page
www.hexun.com
## Not run:
# get first two pages of CSR ratings in 2015
getCSRRating(1,3)
## End(Not run)
get CSR rating from a website for a unit page
getCSRRating_unit(page, date, proxy = NULL)
page |
the page you want to scrape |
date |
the cut-off date, usually the last day of a year, e.g., "2015-12-31" for 2015 or "2014-12-31" for 2014 |
proxy |
the proxy to use, default is NULL (no proxy) |
Get CSR ratings and reports of different companies from http://stockdata.stock.hexun.com/zrbg/
A table of CSR ratings collected from your input page
www.hexun.com
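The date argument of getCSRRating_unit is usually the last day of a year. A minimal base R sketch for building that string from a year follows; `year_end_date` is a hypothetical helper, not a gfer function.

```r
# Hypothetical helper: build the "YYYY-12-31" date string for a given year.
year_end_date <- function(year) {
  # as.Date() validates the value; format() converts it back to a string
  format(as.Date(paste0(year, "-12-31")), "%Y-%m-%d")
}

year_end_date(2015)          # "2015-12-31"
year_end_date(c(2014, 2015)) # works on vectors of years as well
```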
get companies' English names
getENNames(tickers)
tickers |
ticker/symbol of a company; tickers must be character strings with leading zeros kept, e.g., '006027' instead of '6027' |
Data comes from hexun.com
A data table with companies' EN names
http://hexun.com
## Not run:
getENNames('601857')
## End(Not run)
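Tickers stored as numbers lose their leading zeros, so they need to be converted back to zero-padded character strings before being passed to the ticker arguments. A small base R sketch, where `pad_ticker` is a hypothetical helper and the default width of 6 is an assumption that suits six-digit tickers:

```r
# Hypothetical helper: turn numeric tickers into zero-padded character strings.
# width = 6 is an assumption of this sketch; adjust it for other ticker lengths.
pad_ticker <- function(ticker, width = 6) {
  formatC(as.integer(ticker), width = width, format = "d", flag = "0")
}

pad_ticker(6027)            # "006027"
pad_ticker(c(601857, 5))    # "601857" "000005"
```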
get a company's English name
getENNames_unit(ticker)
ticker |
ticker/symbol of a company; must be a character string with leading zeros kept, e.g., '006027' instead of '6027' |
Data comes from hexun.com
A data table with companies' EN names
http://hexun.com
get the exchange on which a company is listed
getExchange(tickers)
tickers |
ticker/symbol of a company; tickers must be character strings with leading zeros kept, e.g., '006027' instead of '6027' |
Data comes from www.finance.sina.com.cn
A data table with listed companies' tickers, security names and listing exchanges
www.finance.sina.com.cn
## Not run:
getExchange('600601')
getExchange(c('00005', '00001'))
## End(Not run)
get a company's historical market cap, data comes from NetEase
getHisMktCap(tickers, date1, date2)
tickers |
ticker/symbol of a company; tickers must be character strings with leading zeros kept, e.g., '006027' instead of '6027' |
date1 |
starting date, in the format "20160101", which means Jan 1st of 2016 |
date2 |
ending date, in the format "20160101"; if you only want one day's data, set the starting date and ending date to the same day |
The input date interval should contain at least one work day. Data comes from www.money.163.com
A data table with companies' total capitalization and market capitalization
www.money.163.com
## Not run:
getHisMktCap('601857', '20161202', '20161203')
## End(Not run)
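The date1 and date2 arguments use the compact "YYYYMMDD" format, and the interval should contain at least one work day. The base R sketch below builds such strings and checks the interval for a weekday; `to_yyyymmdd` and `has_weekday` are hypothetical helpers, and public holidays are ignored, which is a simplification.

```r
# Hypothetical helper: convert a date to the "YYYYMMDD" string format.
to_yyyymmdd <- function(x) format(as.Date(x), "%Y%m%d")

# Hypothetical helper: TRUE if the interval contains a Monday-to-Friday day.
# Public holidays are not considered in this sketch.
has_weekday <- function(date1, date2) {
  days <- seq(as.Date(date1, format = "%Y%m%d"),
              as.Date(date2, format = "%Y%m%d"), by = "day")
  any(!(as.POSIXlt(days)$wday %in% c(0, 6)))  # 0 = Sunday, 6 = Saturday
}

date1 <- to_yyyymmdd("2016-12-02")  # a Friday
date2 <- to_yyyymmdd("2016-12-03")  # a Saturday
has_weekday(date1, date2)           # TRUE
```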
get a company's historical market cap, data comes from NetEase
getHisMktCap_unit(ticker, date1, date2)
ticker |
ticker/symbol of a company; must be a character string with leading zeros kept, e.g., '006027' instead of '6027' |
date1 |
starting date, in the format "20160101", which means Jan 1st of 2016 |
date2 |
ending date, in the format "20160101"; if you only want one day's data, set the starting date and ending date to the same day |
Data comes from www.money.163.com
A data table with companies' total capitalization and market capitalization
www.money.163.com
get the indexes in which a company is included
getIndex(tickers, indexData)
tickers |
ticker/symbol of a company; must be a character string, e.g., "006600" instead of 006600. Tickers have to be full and exact: for the Shanghai and Shenzhen exchanges the input must have 6 digits, and for the HK exchange it must have 5 digits. The leading '0's cannot be left out. |
indexData |
the index information; before running getIndex, indexData needs to be loaded using getIndexData() |
Data comes from www.finance.sina.com.cn and www.etnet.com.hk
A data table with companies and the indexes in which they are included
www.finance.sina.com.cn www.etnet.com.hk
## Not run:
indexData <- getIndexData()
getIndex('600601', indexData)
## End(Not run)
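The ticker rules for getIndex (character input, 6 digits for Shanghai and Shenzhen, 5 digits for HK) can be checked before calling the function. The sketch below is a hypothetical base R helper; it only checks type and length and cannot tell which exchange a ticker belongs to.

```r
# Hypothetical helper: TRUE for character tickers made of exactly 5 or 6 digits.
is_full_ticker <- function(ticker) {
  is.character(ticker) & grepl("^[0-9]{5,6}$", ticker)
}

is_full_ticker("006600")             # TRUE: six digits, leading zeros kept
is_full_ticker(c("00005", "6600"))   # TRUE FALSE
is_full_ticker(6600)                 # FALSE: numeric input is rejected
```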
get the constituents of the indexes in an index pool
getIndexConstnt(indexPool)
indexPool |
a pool of different indexes, in a special format for gfer |
A data table with companies total capitalization and market capitalization
get index information. Currently includes CSI 100, SSE 50, CSI 300, SSE Central SOEs 50, HSI, HSCEI
getIndexData()
a data table containing index information
get National Bureau of Statistics data
getNBS(indicator, start, end)
indicator |
the indicator for which data is fetched; indicators include 'GDP', 'water resources', 'water use' and 'wastewater', etc. |
start |
starting year of data wanted |
end |
end year of data wanted, make sure your input end year exists in the NBS website |
no return
Xuehui YANG (2016). rstatscn: R Interface for China National Data. R package version 1.1.1. https://CRAN.R-project.org/package=rstatscn
get PPP list from an official website
getPPPList(startPage = 1, endPage, proxy = FALSE)
startPage |
the page on which you want to start, default is 1 |
endPage |
the page on which you want to stop scraping |
proxy |
whether proxy will be used, default is FALSE |
Get PPP list from the Ministry of Finance of China (http://www.cpppc.org:8082/efmisweb/ppp/projectLibrary/toPPPList.do?projName=), to view the listed projects in the PPP library.
A table of PPP projects collected from your input page
www.cpppc.org
## Not run:
# scrape the first two pages
getPPPList(1,3)
## End(Not run)
get PPP list from a single page
getPPPList_unit(page, proxy = NULL)
page |
The page number |
proxy |
if you want to use a proxy to avoid being blocked, you can input a proxy; otherwise leave it blank. |
A table of PPP projects collected from your input page
Get proxy pool from free proxy provider
getProxy()
Extract proxies from http://www.free-proxy-list.net/, to reduce the risk of being blocked by the scraped website
A pool of proxies extracted from www.free-proxy-list.net
www.free-proxy-list.net
Get stock information from the Shanghai Exchange and the Shenzhen Exchange only, including stock ticker, stock name and company full name. Data comes from China Merchants Bank
getStockList()
http://info.cmbchina.com/Stock/Single/
It can also be a way to test whether a company is listed. NOTE: if a company is listed on multiple exchanges, the result needs to be double-checked, because the program only chooses the ticker from one exchange, picked at random
getTickers(corpNames)
corpNames |
Full names of the companies; abbreviated names are not accepted |
Data comes from www.cninfo.com.cn/
A data table with companies' stock names and stock tickers
www.cninfo.com.cn
It can also be a way to test if a company is listed
getTickers_unit(corpName)
corpName |
Full name of a company |
Data comes from www.cninfo.com.cn/
A data table with companies' stock names and stock tickers
Get NBS data from a Google Sheet by shared link. The default link is provided by gfer; you can also create your own Google Sheet of GDP. NOTE: 'link sharing on' must be ticked for the sheet in order for it to be read
getWaternomicsData_goog()
Get NBS data from NBS website.
getWaternomicsData_NBS(start, end)
start |
starting year of data wanted |
end |
end year of data wanted, make sure your input end year exists in the NBS website |
get water quality monitoring data of all stations from MEP
getWaterQ_MEP_all(year, week, station1, station2, proxy = FALSE)
year |
In which year you would like to scrape |
week |
In which week you would like to scrape, can be an array, like 3:5 |
station1 |
the start station index on the page |
station2 |
the end station index on the page |
proxy |
Whether to use proxy, default is FALSE |
Get monitoring data of different stations from the Ministry of Environmental Protection of China (http://datacenter.mep.gov.cn/report/getCountGraph.do?type=runQianWater). This function gets the data of all the stations. Since the number of stations varies over time, you have to make sure that the number of stations stays constant within the period you are scraping.
http://datacenter.mee.gov.cn/report/getCountGraph.do?type=runQianWater
## Not run:
# get data from 1st station to 5th station of the 3rd week of 2016
a <- getWaterQ_MEP_all(2016, 3, 1, 5)
## End(Not run)
get water quality monitoring data from MEP for a single week
getWaterQ_MEP_all_unit(year, week, station1, station2, proxy = NULL)
year |
In which year you would like to scrape |
week |
In which week you would like to scrape |
station1 |
the start station index on the page |
station2 |
the end station index on the page |
proxy |
if you want to use a proxy to avoid being blocked, you can input a proxy; otherwise leave it blank. |
A table of water quality monitoring data collected for your input year, week and stations
http://datacenter.mee.gov.cn/report/getCountGraph.do?type=runQianWater
Check if a company is listed in the Chinese stock market
is.listed(corpList, stockList)
corpList |
list of companies you want to check, should be a data frame |
stockList |
Result from getStockList() |
http://info.cmbchina.com/Stock/Single/
If 'Summation of cell padding on y-direction are larger than the height of the cells' appears, just enlarge xlim or ylim accordingly
plotChord( data, t = FALSE, ifsep = TRUE, trans = 0.3, highlight = NULL, xlim = c(-1, 1), ylim = c(-1, 1) )
data |
a data frame showing different management intersections. See the data frame in the example |
t |
whether to transpose the data frame; by default, lines flow from rows to columns; if t == TRUE, lines flow from columns to rows. Once transposed, |
ifsep |
if separate row and col categories in the chart, default is TRUE |
trans |
transparency of the chart's lines, default is 0.3 |
highlight |
a string or string array of highlighted items, which MUST be selected from the first column (which represents names) or from colnames. If highlight has more than one item, they should all belong to the same category, either colnames or names. Mixing one name and one column name is not allowed. |
xlim |
x limit of the chart, default is c(-1, 1) |
ylim |
y limit of the chart, default is c(-1, 1) |
Plot a chord diagram of management intersections, such as the cm matrix; see examples.
## Not run:
plotChord(cm)
plotChord(cm, t = TRUE)
plotChord(cm, highlight = 'MEP')
plotChord(cm, highlight = 'Investment')
## End(Not run)
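The rule for highlight (all items from the first column, or all items from the column names, never a mix) can be checked up front. The sketch below is a hypothetical base R helper working on a made-up toy data frame; the names in `toy` are illustrative only and are not taken from the cm dataset, and excluding the first column's own name from the column category is an assumption.

```r
# Hypothetical helper: TRUE if all highlighted items come from one category.
check_highlight <- function(data, highlight) {
  row_names <- as.character(data[[1]])  # first column holds the row names
  col_names <- colnames(data)[-1]       # remaining column names (assumption)
  all(highlight %in% row_names) || all(highlight %in% col_names)
}

# Made-up toy data, for illustration only
toy <- data.frame(name = c("AgencyA", "AgencyB"),
                  Planning = c(1, 0), Monitoring = c(1, 1))

check_highlight(toy, "AgencyA")                    # TRUE
check_highlight(toy, c("Planning", "Monitoring"))  # TRUE
check_highlight(toy, c("AgencyA", "Planning"))     # FALSE: mixed categories
```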
Plot a scatter pie chart for multidimensional analysis, such as waternomics. This plot can provide information about the water use/wastewater and the GDP mix of each province, see examples.
plotScatterPie( data, pieRange, pieColor = NULL, xmeanLine = TRUE, ymeanLine = TRUE, label_on = TRUE, output = FALSE )
data |
a data frame whose colnames must include these four names: x, y, r, label. |
pieRange |
the range of columns to be presented in the pie chart, see examples |
pieColor |
colors for the different slices of the pie chart |
xmeanLine |
whether to plot the x mean line |
ymeanLine |
whether to plot the y mean line |
label_on |
whether to show labels |
output |
whether you want a ggplot object as output, default is FALSE |
GDPColor_CWR <- c("#6B8033", "#020303", "#0D77B9")
data(GDPmix)
# in colnames(GDPmix), there must be x, y, r, label.
# but right now, GDPmix has x, y, r, but lacks a label column, let's assign label to province column
colnames(GDPmix)[1] <- 'label'
## Not run:
plotScatterPie(GDPmix, pieRange = 4:6, pieColor = GDPColor_CWR)
## End(Not run)
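Because plotScatterPie requires the columns x, y, r and label, it can help to check a data frame for missing names before plotting. The sketch below is a hypothetical base R helper applied to a made-up toy data frame, which does not reproduce the real column layout of GDPmix.

```r
# Hypothetical helper: list the required column names missing from data.
missing_cols <- function(data, required = c("x", "y", "r", "label")) {
  setdiff(required, colnames(data))
}

# Made-up toy data, for illustration only
toy <- data.frame(province = c("A", "B"), x = c(1, 2), y = c(3, 4),
                  r = c(1, 1), primary = c(0.2, 0.3), other = c(0.8, 0.7))

missing_cols(toy)            # "label" is missing
colnames(toy)[1] <- "label"  # rename the province column, as in the example
missing_cols(toy)            # character(0): all required columns are present
```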
the available dbs in the national db
statscnDbs()
a data frame with 2 columns: one is the dbcode, the other is the db description
Xuehui YANG (2016). rstatscn: R Interface for China National Data. R package version 1.1.1. https://CRAN.R-project.org/package=rstatscn
## Not run:
statscnDbs()
## End(Not run)
the main function for querying the statscn database; it retrieves the data from the specified db and organizes the data in a data frame.
statscnQueryData( zb = "A0201", dbcode = "hgnd", rowcode = "zb", colcode = "sj", moreWd = list(name = NA, value = NA) )
zb |
the zb/category code to be queried |
dbcode |
the db code for querying |
rowcode |
rowcode in the returned data frame |
colcode |
colcode in the returned data frame |
moreWd |
additional constraint on the data, where the name should be one of c("reg","sj"), which stand for region and sj/time. The valuecode for reg should be a region code queried by statscnRegions(). The valuecode for sj should be like '2014' for *nd, '2014C' for *jd, '201405' for *yd. Note that the moreWd name must differ from both rowcode and colcode |
the data frame you are querying
Xuehui YANG (2016). rstatscn: R Interface for China National Data. R package version 1.1.1. https://CRAN.R-project.org/package=rstatscn
## Not run:
df <- statscnQueryData('A0201', dbcode = 'hgnd')
df <- statscnQueryData('A0201', dbcode = 'fsnd', rowcode = 'zb', colcode = 'sj', moreWd = list(name = 'reg', value = '110000'))
## End(Not run)
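The constraints on moreWd (name is "reg" or "sj", and differs from both rowcode and colcode) can be checked before querying. The sketch below is a hypothetical base R helper, not part of gfer or rstatscn; treating an NA name as "no extra constraint" follows the default shown in the usage and is otherwise an assumption.

```r
# Hypothetical helper: TRUE if moreWd satisfies the documented constraints.
check_moreWd <- function(moreWd, rowcode, colcode) {
  if (is.na(moreWd$name)) return(TRUE)  # default: no extra constraint
  moreWd$name %in% c("reg", "sj") && !(moreWd$name %in% c(rowcode, colcode))
}

check_moreWd(list(name = "reg", value = "110000"), "zb", "sj")  # TRUE
check_moreWd(list(name = "sj", value = "2014"), "zb", "sj")     # FALSE: clashes with colcode
check_moreWd(list(name = NA, value = NA), "zb", "sj")           # TRUE
```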
fetch the last n rows of data for the latest query; this only affects the number of rows in the returned data. This function cannot be used alone: statscnQueryData() has to be called before it
statscnQueryLastN(n)
n |
the number of rows to be fetched |
the last n rows data in the latest query
Xuehui YANG (2016). rstatscn: R Interface for China National Data. R package version 1.1.1. https://CRAN.R-project.org/package=rstatscn
## Not run:
df <- statscnQueryData('A0201', dbcode = 'hgnd')
df2 <- statscnQueryLastN(20)
## End(Not run)
the sub data categories for the zbid category. dbcode needs to be specified; it can be fetched with the function statscnDbs(). In the returned data frame, the column 'isParent' shows whether each sub category is a leaf category or not
statscnQueryZb(zbid = "zb", dbcode = "hgnd")
zbid |
the parent zb/category id; the root id is 'zb' |
dbcode |
which db will be queried |
the data frame with the sub zbs/categories; if the given zbid is not a parent zb/category, a null list is returned
Xuehui YANG (2016). rstatscn: R Interface for China National Data. R package version 1.1.1. https://CRAN.R-project.org/package=rstatscn
## Not run:
statscnQueryZb()
statscnQueryZb('A01', dbcode = "hgnd")
## End(Not run)
the available regions in the specified db; generally used to query the province, city and country codes
statscnRegions(dbcode = "fsnd")
dbcode |
the dbcode should be a province db (fs*), city db (cs*) or international db (gj*) |
the data frame with all the available region codes and names in the db
Xuehui YANG (2016). rstatscn: R Interface for China National Data. R package version 1.1.1. https://CRAN.R-project.org/package=rstatscn
## Not run:
statscnRegions('fsnd')
statscnRegions('csnd')
statscnRegions('gjnd')
## End(Not run)
set the rowName prefix in the dataframe
statscnRowNamePrefix(p = "nrow")
p |
how to set the row name prefix. It is 'nrow' by default, which is currently the only supported value. To unset the row name prefix, call this function with p = NULL |
In case you encounter the following error: Error in 'row.names<-.data.frame'('*tmp*', value = value) : duplicate 'row.names' are not allowed, you need to call this function
no return
Xuehui YANG (2016). rstatscn: R Interface for China National Data. R package version 1.1.1. https://CRAN.R-project.org/package=rstatscn
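The duplicate row names error that statscnRowNamePrefix works around can be reproduced in base R without any network access. The sketch below only demonstrates the error and one generic base R way to make names unique (`make.unique`); it does not show what statscnRowNamePrefix does internally.

```r
# A data frame refuses duplicated row names
df <- data.frame(value = 1:2)
result <- suppressWarnings(tryCatch(
  { row.names(df) <- c("a", "a"); "ok" },
  error = function(e) "error"
))
result  # "error"

# Generic base R alternative: make the names unique before assigning them
unique_names <- make.unique(c("a", "a"))
row.names(df) <- unique_names
row.names(df)  # "a" "a.1"
```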