Python API example returning error

Hello!

I have been trying to get the Python API working, and by default (copy-pasted from the Python example) it returns the following error:

ValueError: No ':' found when decoding object value

It comes from the line: df = pd.read_json(r.text, orient='index')

r.text is in the format: '{"1388534400000":{"[\"output\",\"kW\"]":0.0}, … , "1420066800000":{"[\"output\",\"kW\"]":0.0}}'

Any ideas on how to fix this?

The obvious work-around would be to get the data as CSV and use the corresponding functions for that; however, I get errors about too-long path names on Windows… (though this is easily fixed)

Some googling on the problem returns different solutions:

  1. It is a known pandas issue and can be worked around with pandas.DataFrame.to_json and pandas.DataFrame.reset_index; however, in this case the JSON would already need to be in a DataFrame for those tools to apply.
  2. The data is bad / in the wrong format / something.

P.S. I’m not the best programmer so even an obvious solution might be the correct one (newest version of pandas - check)
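One way around the read_json error is to parse the response with the json module first and build the DataFrame by hand. A minimal sketch; the payload below is made up to mimic the format quoted above, not real API output:

```python
import json
import pandas as pd

# Made-up payload mimicking the quoted r.text: epoch-millisecond keys,
# inner keys are the stringified ["output","kW"] labels.
raw = ('{"1388534400000":{"[\\"output\\",\\"kW\\"]":0.0},'
       '"1420066800000":{"[\\"output\\",\\"kW\\"]":0.5}}')

data = json.loads(raw)
# Take the single value of each inner dict, keyed by timestamp
series = pd.Series({int(ts): next(iter(vals.values()))
                    for ts, vals in data.items()})
series.index = pd.to_datetime(series.index, unit='ms')
df = series.to_frame(name='output')
print(df)
```

This sidesteps whatever the decoder trips over in the nested keys, at the cost of handling the structure yourself.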

Hey!

These two might be the reason for this problem:

A follow-up question: can the raw data (direct, diffuse, temp) be included in the API request?
If so, how?

Next challenge:

File "C:\Data\Solar\gsee-master\gsee\trigon.py", line 183, in aperture_irradiance
sunrise_set_times = sun_rise_set_times(direct.index, coords)

File "C:\Data\Solar\gsee-master\gsee\trigon.py", line 56, in sun_rise_set_times
dtindex = pd.DatetimeIndex(datetime_index.to_series().map(pd.Timestamp.date).unique())

File "C:\Users\f0386329\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\series.py", line 2177, in map
new_values = map_f(values, arg)

File "pandas\src\inference.pyx", line 1207, in pandas.lib.map_infer (pandas\lib.c:66124)

TypeError: descriptor 'date' requires a 'datetime.datetime' object but received a 'int'

trigon.py says: # Ensure datetime_index is daily

So if I understood correctly, I should have a variable named datetime_index as a Series/DataFrame of datetime type. Is this supposed to be linked to the DataFrame that gsee.pv.run_plant_model demands (df.datetime_index)? Is it supposed to be in YEAR-MONTH-DAY format with no hours? Does it have 365 values or 8760 (i.e. a date for each DataFrame row)?

The latest problem was solved by replacing the default index [0…8760] with the datetime index: in practice, by telling the CSV-reading function which column is the index and adding date parsing.

Problem solved.
The DataFrame passed to run_plant_model should have:

  1. an index column of year-month-day XX:XX:XX values in datetime type
  2. headers skipped (the unit row was causing errors)
  3. the data itself (I did not test how it reacts to NaN values; I substituted them with 0s)
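A minimal sketch of that CSV-reading step using pandas' read_csv; the column names and the number of header lines below are made up, so adjust to the actual file:

```python
import io
import pandas as pd

# Made-up CSV mimicking the API output: one metadata line, a header,
# then hourly rows.
csv_text = ("# metadata line\n"
            "time,output\n"
            "2014-01-01 00:00:00,0.0\n"
            "2014-01-01 01:00:00,0.1\n")

# skiprows drops the metadata line; index_col + parse_dates turn the
# first column into a DatetimeIndex, which is what trigon.py expects.
df = pd.read_csv(io.StringIO(csv_text), skiprows=1,
                 index_col=0, parse_dates=True)
print(df.index)
```

With a real file you would pass the path instead of the StringIO buffer.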

Hi Valtteri.Iaine,

Sorry, I am new to Python. However, it seems you have solved your problems. Could you upload your solution, please?

Thank you
Sebastian

Crude fix for the solar API:
change args 'format' to csv

After the r = … line (replacing the last 2 lines):

r = r.text.split('\n')
splitted1 = []
splitted2 = []
for i in range(2, 8762):
    x = r[i].split(',')
    splitted1.append(x[0])
    splitted2.append(x[1])
df = pd.DataFrame(index=splitted1, columns=['output'])
df['output'] = splitted2
# DataFrame to csv-file
# df.to_csv(path_or_buf=, columns=['output'], index_label='TimeUTC')
# print(df)


Hi Valtteri,

I used your suggestion and made the script like the following, but it comes back with a syntax error:

import requests
import pandas as pd
token = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
api_base = 'https://www.renewables.ninja/api/v1/'
s = requests.session()
s.headers = {'Authorization': 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' + token}
url = api_base + 'data/pv'
args = {
    'lat': 34.125,
    'lon': 39.814,
    'date_from': '2014-01-01',
    'date_to': '2014-12-31',
    'dataset': 'merra2',
    'capacity': 1.0,
    'system_loss': 10,
    'tracking': 0,
    'tilt': 34,
    'azim': 180,
    'format': 'csv'
}

r = s.get(url, params=args)
r = r.text.split('\n')
splitted1 = []
splitted2 = []
for i in range(2, 8760):
    x = r[i].split(',')
    splitted1.append(x[0])
    splitted2.append(x[1])
    df = pd.DataFrame(index=splitted1, columns=['output'])

  File "", line 5
    df = pd.DataFrame(index=splitted1, columns=['output'])
    ^
SyntaxError: invalid syntax

Could you please help me resolve this problem? I am not a good programmer. Also, if I need to automatically download data for a range of latitudes and longitudes (for example, for an island), what should I write in my script?

Thank you very much

A SyntaxError means you must change how the code is written; the little arrow often points at where the error is. In this case, I think it is that you have:
"df = pd.DataFrame(index=splitted1, columns=['output'])"
inside the for-loop. Try removing the indentation before df = … (also before df['output'] = …).
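In other words, the DataFrame should be built once, after the loop finishes. A minimal sketch; the three rows below are made-up stand-ins for the ~8760 lines of r:

```python
import pandas as pd

# Stand-ins for r[2:8762]; real rows look like "timestamp,value"
rows = ["t0,0.0", "t1,0.1", "t2,0.2"]

splitted1 = []
splitted2 = []
for row in rows:
    x = row.split(',')
    splitted1.append(x[0])
    splitted2.append(x[1])
# No indentation here: the DataFrame is built once, after the loop
df = pd.DataFrame(index=splitted1, columns=['output'])
df['output'] = splitted2
print(df)
```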

If you want data from multiple locations, I'd do this (after the url = api_base + 'data/pv' line):

save_list = []
lat_list = [lat1, lat2, lat3, …, latn]
lon_list = [lon1, lon2, lon3, …, lonn]
if len(lat_list) == len(lon_list):
    print("Coords OK")
for j in range(0, len(lat_list)):
    args = {
        'lat': lat_list[j],
        'lon': lon_list[j],

The rest is the same until the lines:
df = pd.DataFrame(index=splitted1, columns=['output'])
df['output'] = splitted2

Replace those lines with:
    save_list[j] = splitted2

And at the end add something like (not inside either of the for-loops):
df = pd.DataFrame(data=save_list, index=splitted1)

Hi Valtteri,

Thank you very much for your suggestion to take df = pd.DataFrame(index=splitted1, columns=['output']) out of the indentation. It is working now (for the one-coordinate case).

However, when I try to change the year to 2013 or 2015, using the same token or a different token, the result comes back with an error as follows (even though, with a token, we should be able to get 4 years of data):

import requests
import pandas as pd
token = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
api_base = 'https://www.renewables.ninja/api/v1/'
s = requests.session()
s.headers = {'Authorization': 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxx ' + token}
url = api_base + 'data/pv'
args = {
    'lat': -7.250,
    'lon': 112.757,
    'date_from': '2015-01-01',
    'date_to': '2015-12-31',
    'dataset': 'merra2',
    'capacity': 1.0,
    'system_loss': 10,
    'tracking': 0,
    'tilt': 7,
    'azim': 180,
    'format': 'csv'
}

r = s.get(url, params=args)
r = r.text.split('\n')
splitted1 = []
splitted2 = []
for i in range(2, 8762):
    x = r[i].split(',')
    splitted1.append(x[0])
    splitted2.append(x[1])

Traceback (most recent call last):
  File "", line 2, in
IndexError: list index out of range

# then when I continue with:

df = pd.DataFrame(index=splitted1, columns=['output'])
df['output'] = splitted2
print(df)

Empty DataFrame
Columns: [output]
Index: []

What do you think?

Also, what I mean by multiple locations is this: I observed that the resolution used by renewables.ninja is 0.01 degrees, meaning that if you enter, say, Lat 7.000, you get a different output than for Lat 7.010.

So if I want data for, say, Lat 7.000 to 8.000 at 0.01 intervals, and likewise for longitude from Lon 105.000 to 110.000 at 0.01-degree intervals, how would you set up the script so that, for one year, we get that many outputs corresponding to the coordinate changes?

Thank you so much!

The date range information says that, due to server limitations, simulations can only be run for the year 2014: "Contact us for years before 2014."

The earlier code is the same, except replace lat_list and lon_list with:

lat_list = []
lon_list = []
lat_start = X1
lat_stop = X2
lat_step = X3
lon_start = Y1
lon_stop = Y2
lon_step = Y3
lat = lat_start - lat_step
while lat <= lat_stop:
    lat += lat_step
    lon = lon_start
    while lon <= lon_stop:
        lon_list.append(lon)
        lat_list.append(lat)
        lon += lon_step

EDIT: fixed
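One caveat with stepping floats like this: 0.01 is not exactly representable in binary, so the accumulated lat/lon values drift slightly, and a bare lat <= lat_stop comparison can miss the last value. A sketch that adds a tolerance and rounds each value back onto the grid (the start/stop/step numbers below are just example values standing in for X1…Y3):

```python
# Example grid bounds (stand-ins for X1..Y3 above)
lat_start, lat_stop, lat_step = 7.00, 8.00, 0.01
lon_start, lon_stop, lon_step = 105.00, 110.00, 0.01

lat_list = []
lon_list = []
lat = lat_start
while lat <= lat_stop + 1e-9:           # tolerance absorbs float drift
    lon = lon_start
    while lon <= lon_stop + 1e-9:
        lat_list.append(round(lat, 2))  # snap back onto the 0.01 grid
        lon_list.append(round(lon, 2))
        lon += lon_step
    lat += lat_step

# 101 latitudes x 501 longitudes
print(len(lat_list))
```

Note that this example grid is over 50,000 coordinate pairs, i.e. over 50,000 API requests; the server's rate limits will most likely not allow that many.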

Hi Valtteri,

Thanks for the script. I have gone through it, but I encountered an error message when it was almost finished:

import requests
import pandas as pd
token = 'xxxxxxxxxxxxxxxxxxxx'
api_base = 'https://www.renewables.ninja/api/v1/'
s = requests.session()
s.headers = {'Authorization': 'xxxxxxxxxxxxxxxxxxxx ' + token}
url = api_base + 'data/pv'
save_list = []
lat_list = []
lon_list = []
lat_start = 7.000
lat_stop = 8.000
lat_step = 0.01
lon_start = 105.000
lon_stop = 110.000
lon_step = 0.01
lat = lat_start - lat_step
while lat <= lat_stop:
    lat += lat_step
    lon = lon_start
    while lon <= lon_stop:
        lon_list.append(lon)
        lat_list.append(lat)
        lon += lon_step

if len(lat_list) == len(lon_list):
    print("Coords OK")

Coords OK

for j in range(0, len(lat_list)):
    args = {
        'lat': lat_list[j],
        'lon': lon_list[j],
        'date_from': '2014-01-01',
        'date_to': '2014-12-31',
        'dataset': 'merra2',
        'capacity': 1.0,
        'system_loss': 10,
        'tracking': 0,
        'tilt': 7.5,
        'azim': 180,
        'format': 'csv'
    }

r = s.get(url, params=args)
r = r.text.split('\n')
splitted1 = []
splitted2 = []
for i in range(2, 8762):
    x = r[i].split(',')
    splitted1.append(x[0])
    splitted2.append(x[1])
    save_list[j] = splitted2

Traceback (most recent call last):
  File "", line 5, in
IndexError: list assignment index out of range

I don't know what is wrong.

Also, if we move our latitude, the 'tilt' should be changed as well, right? For example, if the latitude is 6 or changes to 7, how do I incorporate the tilt automatically in the script?

Thank you so much!

Hi Valtteri,

I mean I got an error in the following:

C:\ProgramData\Anaconda3\python.exe C:/Users/Yusak/Documents/renewables.ninja/multiple.py
Coords OK
Traceback (most recent call last):
File "C:/Users/Yusak/Documents/renewables.ninja/multiple.py", line 51, in
save_list[j] = splitted2
IndexError: list assignment index out of range

So it is an error in: save_list[j] = splitted2

r = s.get(url, params=args)
r = r.text.split('\n')
splitted1 = []
splitted2 = []
for i in range(2, 8762):
    x = r[i].split(',')
    splitted1.append(x[0])
    splitted2.append(x[1])
    save_list[j] = splitted2

These should be inside the for-j loop, with the save_list assignment outside and after the for-i loop. You will have two loops for getting the data: the j-loop goes over the list of all argument sets you want data for (the lat, lon, and tilt lists), and the i-loop strips the data. The j-loop runs once per solution, so save_list should be an array of those solutions.
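A skeleton of that nesting, on made-up stand-in data. Note that appending to save_list avoids the "list assignment index out of range" error that save_list[j] = … raises on an empty list:

```python
save_list = []
sites = ['site-a', 'site-b']            # stand-ins for the (lat, lon) pairs

for j in range(len(sites)):             # j-loop: one request per site
    rows = ["t0,0.0", "t1,0.1"]         # stand-in for r[2:8762]
    splitted2 = []
    for row in rows:                    # i-loop: strips this site's data
        splitted2.append(row.split(',')[1])
    save_list.append(splitted2)         # after the i-loop, once per site

print(len(save_list))
```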

If you want to incorporate tilt as a changing variable, then:
to the args section add 'tilt': tilt_list[j],
where you create the lists for lat and lon, insert:
tilt_list = []
tilt_start = Z1
in the while-loop:
tilt = …  # whatever equation you want your tilt to be linked to (I don't understand why the tilt should change as well, unless you are using gsee.pv.optimal_tilt(lat))
tilt_list.append(tilt)

And to check that you did everything correctly:
if len(lat_list) == len(lon_list) and len(lat_list) == len(tilt_list):
    print("Coords and tilt-list OK")

It would be easier if you pasted the whole code instead of just the error messages. It seems like a lot of your indents have gone wrong.

Hi Valtteri,

Thank you so much. It works now.
By the way, can you share how you get the other data, I mean the irradiation and temperature for the corresponding location?

Hey!

I download it manually for a site from the renewables.ninja main site. My use for the data is to optimize a site-specific energy system, so the radiation and temperature data stay the same. The calculation goes through every possible angle, azimuth, and capacity to find the most cost-efficient solution for that site (the calculations require hourly output for the PV, which is why I'm using this nice tool created by the renewable ninjas). And since the optimization might go through over 20k solutions per site, making 20k requests to a 'free' (cost = academic citation) server is not realistic; the server delay would also add a great amount of time to the optimization.

This however has one limitation: I have to scale the radiation data back to the original so it can be fed back into gsee.pv.run_plant_model. It is done in two steps:

  1. Set tilt/angle to 0 (so that the diffuse component is at its maximum value and doesn't have to be scaled again)
  2. df['global_horizontal'] = df['diffuse'] + ((60 * df['direct'] * np.cos(df_b['sun_zenith'])) / (np.cos(np.arccos(np.sin(df_b['sun_alt']))) * df_b['duration']))

where df refers to the data read from the file, and df_b is a dataframe of sun positions etc. (gsee.trigon.sun_angles). The average error from this is -0.01%, which is due to decimal rounding.
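Step 2 on made-up numbers (the values below are synthetic one-row stand-ins, not real gsee output; incidentally, np.cos(np.arccos(np.sin(alt))) simplifies to np.sin(alt)):

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the two dataframes (angles in radians,
# duration in minutes; all values invented for illustration)
df = pd.DataFrame({'diffuse': [50.0], 'direct': [400.0]})
df_b = pd.DataFrame({'sun_zenith': [np.deg2rad(30.0)],
                     'sun_alt': [np.deg2rad(60.0)],
                     'duration': [60.0]})

# The back-scaling formula from step 2 above
df['global_horizontal'] = df['diffuse'] + (
    (60 * df['direct'] * np.cos(df_b['sun_zenith']))
    / (np.cos(np.arccos(np.sin(df_b['sun_alt']))) * df_b['duration'])
)
print(df['global_horizontal'].iloc[0])
```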

I would be very interested in a solution that replaces this manual step of downloading the site-specific data with getting the data automatically through the API.

Has anyone tried to add extra arguments to the API request?
Like 'rawdata': True or 'include_raw_data': 'True' (in any format)

Staffell/Pfenninger, is this possible? Or could such an addition be made?

-Valtteri