
RdTools Overview

RdTools is an open-source library to support reproducible technical analysis of time series data from photovoltaic energy systems. The library aims to provide best practice analysis routines along with the building blocks for users to tailor their own analyses. Current applications include the evaluation of PV production over several years to obtain rates of performance degradation and soiling loss, as well as the analysis of system- and subsystem-level availability. RdTools can handle both high-frequency (hourly or better) and low-frequency (daily, weekly, etc.) datasets; best results are obtained with higher frequency data.

Full examples are worked out in the notebooks shown in Examples.

To report issues, contribute code, or suggest improvements to this documentation, visit the RdTools development repository on GitHub.

Degradation and Soiling

Both degradation and soiling analyses are based on normalized yield, similar to performance index. Usually, this is computed at the daily level although other aggregation periods are supported. A typical analysis of soiling and degradation contains the following:

  1. Import and preliminary calculations
  2. Normalize data using a performance metric
  3. Filter data that creates bias
  4. Aggregate data
  5. Analyze aggregated data to estimate the degradation rate and/or soiling loss

Steps 1 and 2 may be accomplished with the clearsky workflow (see the Examples), which can help eliminate problems from irradiance sensor drift.
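
A minimal sketch of these steps, assuming Pandas time series of measured power (pv_power), expected power (expected_power), plane-of-array irradiance (poa), and cell temperature (tcell) have already been prepared (the Examples show the full workflow, including the preliminary calculations and filtering choices):

import rdtools

# Step 2: normalize measured power against expected power
normalized, insolation = rdtools.normalize_with_expected_power(
    pv_power, expected_power, poa)

# Step 3: filter out points that would bias the analysis
mask = (rdtools.normalized_filter(normalized) &
        rdtools.poa_filter(poa) &
        rdtools.tcell_filter(tcell) &
        rdtools.clip_filter(pv_power))

# Step 4: aggregate to daily values, weighted by insolation
daily_energy = rdtools.aggregation_insol(normalized[mask], insolation[mask],
                                         frequency='D')
daily_insolation = insolation[mask].resample('D').sum()

# Step 5: estimate the degradation rate and/or soiling loss
yoy_rd, yoy_ci, yoy_info = rdtools.degradation_year_on_year(daily_energy)
sr, sr_ci, soiling_info = rdtools.soiling_srr(daily_energy, daily_insolation)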

RdTools workflow diagram

Degradation

The preferred method for degradation rate estimation is the year-on-year (YOY) approach (Jordan 2018), available in degradation.degradation_year_on_year(). The YOY calculation yields a distribution of degradation rates, the central tendency of which is the most representative of the true degradation. The width of the distribution provides information about the uncertainty in the estimate via a bootstrap calculation. The Examples use the output of degradation.degradation_year_on_year() to visualize the calculation.

RdTools degradation results plot

Two workflows are available for system performance ratio calculation and are illustrated in an example notebook. The sensor-based approach assumes that site irradiance and temperature sensors are calibrated and in good repair. Since this is not always the case, a 'clear-sky' workflow is provided that is based on modeled temperature and irradiance. Note that site irradiance data is still required to identify the clear-sky conditions to be analyzed. In many cases, the 'clear-sky' analysis can identify periods of instrument error or irradiance sensor drift, such as in the analysis above.

The clear-sky analysis tends to provide less stable results than sensor-based analysis when details such as filtering are changed. We generally recommend that the clear-sky analysis be used as a check on the sensor-based results, rather than as a stand-alone analysis.

Soiling

Soiling can be estimated with the stochastic rate and recovery (SRR) method (Deceglie 2018). This method works well when soiling patterns follow a "sawtooth" pattern, a linear decline followed by a sharp recovery associated with natural or manual cleaning. soiling.soiling_srr() performs the calculation and returns the P50 insolation-weighted soiling ratio, confidence interval, and additional information (soiling_info) which includes a summary of the soiling intervals identified, soiling_info['soiling_interval_summary']. This summary table can, for example, be used to plot a histogram of the identified soiling rates for the dataset.
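
A brief sketch of this calculation, assuming daily aggregated series of normalized energy and insolation (energy_normalized_daily and insolation_daily, following the function summary later on this page):

from rdtools.soiling import soiling_srr
import rdtools

sr, sr_ci, soiling_info = soiling_srr(energy_normalized_daily, insolation_daily)

# Inspect the identified soiling intervals, or plot a histogram of the
# soiling rates found for the dataset
summary = soiling_info['soiling_interval_summary']
fig = rdtools.plotting.soiling_rate_histogram(soiling_info, bins=50)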

RdTools soiling results plot

Availability

Evaluating system availability can be confounded by data loss from interrupted datalogger or system communications. RdTools implements two methods (Anderson & Blumenthal 2020) of distinguishing nuisance communication interruptions from true production outages with the availability.AvailabilityAnalysis class. In addition to classifying data outages, it estimates lost production and calculates energy-weighted system availability.
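
A minimal sketch of its use, assuming Pandas time series of system meter power, subsystem (e.g. inverter) power, cumulative meter energy, and expected power (the system availability example below walks through a full analysis):

from rdtools.availability import AvailabilityAnalysis

aa = AvailabilityAnalysis(power_system=power_system,
                          power_subsystem=power_subsystem,
                          energy_cumulative=energy_cumulative,
                          power_expected=power_expected)
aa.run(rollup_period='D')  # classify outages and roll up to daily metrics
print(aa.results[['lost_production', 'availability']])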

RdTools availability analysis plot

Install RdTools using pip

RdTools can be installed from PyPI using pip on the command line:

pip install rdtools

Alternatively, it can be installed manually using the command line:

  1. Download a release (Or to work with a development version, clone or download the rdtools repository).
  2. Navigate to the repository: cd rdtools
  3. Install via pip: pip install .

On some systems installation with pip can fail due to problems installing requirements. If this occurs, the requirements specified in setup.py may need to be separately installed (for example by using conda) before installing rdtools.

For more detailed instructions, see the Developer Notes page.

RdTools is currently tested on Python 3.6+.

Usage and examples

Full workflow examples are found in the notebooks in Examples. The examples are designed to work with Python 3.7. For a consistent experience, we recommend installing the packages and versions documented in docs/notebook_requirements.txt. This can be achieved in your environment by first installing RdTools as described above, then running pip install -r docs/notebook_requirements.txt from the base directory.

The following functions are used for degradation, soiling, and availability analysis:

import rdtools

The most frequently used functions are:

normalization.normalize_with_expected_power(pv, power_expected, poa_global,
                                            pv_input='power')
  '''
  Inputs: Pandas time series of raw power or energy, expected power, and
     plane of array irradiance.
  Outputs: Pandas time series of normalized energy and POA insolation
  '''
filtering.poa_filter(poa_global); filtering.tcell_filter(temperature_cell);
filtering.clip_filter(power_ac); filtering.normalized_filter(energy_normalized);
filtering.csi_filter(poa_global_measured, poa_global_clearsky);
  '''
  Inputs: Pandas time series of raw data to be filtered.
  Output: Boolean mask where `True` indicates acceptable data
  '''
aggregation.aggregation_insol(energy_normalized, insolation, frequency='D')
  '''
  Inputs: Normalized energy and insolation
  Output: Aggregated data, weighted by the insolation.
  '''
degradation.degradation_year_on_year(energy_normalized)
  '''
  Inputs: Aggregated, normalized, filtered time series data
  Outputs: Tuple: `yoy_rd`: Degradation rate
    `yoy_ci`: Confidence interval `yoy_info`: associated analysis data
  '''
soiling.soiling_srr(energy_normalized_daily, insolation_daily)
  '''
  Inputs: Daily aggregated, normalized, filtered time series data for normalized performance and insolation
  Outputs: Tuple: `sr`: Insolation-weighted soiling ratio
    `sr_ci`: Confidence interval `soiling_info`: associated analysis data
  '''
availability.AvailabilityAnalysis(power_system, power_subsystem,
                                  energy_cumulative, power_expected)
  '''
  Inputs: Pandas time series system and subsystem power and energy data
  Outputs: DataFrame of production loss and availability metrics
  '''

Citing RdTools

The underlying workflow of RdTools has been published in several places. If you use RdTools in a published work, please cite the following as appropriate:

  • D. Jordan, C. Deline, S. Kurtz, G. Kimball, M. Anderson, "Robust PV Degradation Methodology and Application", IEEE Journal of Photovoltaics, 8(2), pp. 525-531, 2018
  • M. G. Deceglie, L. Micheli and M. Muller, "Quantifying Soiling Loss Directly From PV Yield", IEEE Journal of Photovoltaics, 8(2), pp. 547-551, 2018
  • K. Anderson and R. Blumenthal, "Overcoming Communications Outages in Inverter Downtime Analysis", 2020 IEEE 47th Photovoltaic Specialists Conference (PVSC)
  • RdTools, version x.x.x, https://github.com/NREL/rdtools, https://doi.org/10.5281/zenodo.1210316
    • Be sure to include the version number used in your analysis!

References

Other useful references which may also be consulted for degradation rate methodology include:

  • D. C. Jordan, M. G. Deceglie, S. R. Kurtz, "PV degradation methodology comparison — A basis for a standard", 43rd IEEE Photovoltaic Specialists Conference, Portland, OR, USA, 2016, DOI: 10.1109/PVSC.2016.7749593.
  • D. C. Jordan, S. R. Kurtz, K. T. VanSant, J. Newmiller, "Compendium of Photovoltaic Degradation Rates", Progress in Photovoltaics: Research and Applications, 2016, 24(7), 978-989.
  • D. Jordan, S. Kurtz, "PV Degradation Rates – an Analytical Review", Progress in Photovoltaics: Research and Applications, 2013, 21(1), 12-29.
  • E. Hasselbrink, M. Anderson, Z. Defreitas, M. Mikofski, Y.-C. Shen, S. Caldwell, A. Terao, D. Kavulak, Z. Campeau, D. DeGraaff, "Validation of the PVLife model using 3 million module-years of live site data", 39th IEEE Photovoltaic Specialists Conference, Tampa, FL, USA, 2013, pp. 7-13, DOI: 10.1109/PVSC.2013.6744087.

Documentation Contents

Examples

This page shows example usage of the RdTools analysis functions.

Degradation and soiling example with clearsky workflow

This Jupyter notebook is intended to demonstrate the RdTools analysis workflow. It also demonstrates the effects of changes in the workflow. For a consistent experience, we recommend installing the specific versions of packages used to develop this notebook. This can be achieved in your environment by running pip install -r requirements.txt followed by pip install -r docs/notebook_requirements.txt from the base directory. (RdTools must also be separately installed.) These environments and examples are tested with Python 3.7.

The calculations consist of several steps illustrated here:

  1. Import and preliminary calculations

  2. Normalize data using a performance metric

  3. Filter data that creates bias

  4. Aggregate data

  5. Analyze aggregated data to estimate the degradation rate

  6. Analyze aggregated data to estimate the soiling loss

After demonstrating these steps using sensor data, a modified version of the workflow is illustrated using modeled clear sky irradiance and temperature. The results from the two methods are compared at the end.

This notebook works with data from the NREL PVDAQ [4] NREL x-Si #1 system. Note that because this system does not experience significant soiling, the dataset contains a synthesized soiling signal for use in the soiling section of the example. This notebook automatically downloads and locally caches the dataset used in this example. The data can also be found on the DuraMAT Datahub (https://datahub.duramat.org/dataset/pvdaq-time-series-with-soiling-signal).

[1]:
from datetime import timedelta
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import pvlib
import rdtools
%matplotlib inline
[2]:
#Update the style of plots
import matplotlib
matplotlib.rcParams.update({'font.size': 12,
                           'figure.figsize': [4.5, 3],
                           'lines.markeredgewidth': 0,
                           'lines.markersize': 2
                           })
# Register time series plotting in pandas > 1.0
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()
[3]:
# Set the random seed for numpy to ensure consistent results
np.random.seed(0)

0: Import and preliminary calculations

This section prepares the data necessary for an rdtools calculation. The first step of the rdtools workflow is normalization, which requires a time series of energy yield, a time series of cell temperature, and a time series of irradiance, along with some metadata (see Step 1: Normalize).

The following section loads the data, adjusts units where needed, and renames the critical columns. The ambient temperature sensor data is converted into estimated cell temperature. This dataset already has plane-of-array irradiance data, so no transposition is necessary.

A common challenge is handling datasets with and without daylight saving time. Make sure to specify a pytz timezone that does or does not include daylight saving time as appropriate for your dataset.
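
For example (illustrative only; the actual localization for this dataset happens in the data import cell below):

# Fixed-offset timezone with no daylight saving time, as used for this dataset:
df.index = df.index.tz_localize('Etc/GMT+7')
# A DST-aware alternative for the same location would be 'America/Denver':
# df.index = df.index.tz_localize('America/Denver')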

The steps of this section may change depending on your data source or the system being considered. Transposition of irradiance and modeling of cell temperature are generally outside the scope of rdtools. A variety of tools for these calculations are available in pvlib.

[4]:
# Import the example data
file_url = ('https://datahub.duramat.org/dataset/a49bb656-7b36-'
            '437a-8089-1870a40c2a7d/resource/5059bc22-640d-4dd4'
            '-b7b1-1e71da15be24/download/pvdaq_system_4_2010-2016'
            '_subset_soilsignal.csv')
cache_file = 'PVDAQ_system_4_2010-2016_subset_soilsignal.pickle'

try:
    df = pd.read_pickle(cache_file)
except FileNotFoundError:
    df = pd.read_csv(file_url, index_col=0, parse_dates=True)
    df.to_pickle(cache_file)

df = df.rename(columns = {
    'ac_power':'power_ac',
    'wind_speed': 'wind_speed',
    'ambient_temp': 'Tamb',
    'poa_irradiance': 'poa',
})

# Specify the Metadata
meta = {"latitude": 39.7406,
        "longitude": -105.1774,
        "timezone": 'Etc/GMT+7',
        "gamma_pdc": -0.005,
        "azimuth": 180,
        "tilt": 40,
        "power_dc_rated": 1000.0,
        "temp_model_params":
        pvlib.temperature.TEMPERATURE_MODEL_PARAMETERS['sapm']['open_rack_glass_polymer']}

df.index = df.index.tz_localize(meta['timezone'])

loc = pvlib.location.Location(meta['latitude'], meta['longitude'], tz = meta['timezone'])
sun = loc.get_solarposition(df.index)

# There is some missing data, but we can infer the frequency from
# the first several data points
freq = pd.infer_freq(df.index[:10])

# Then set the frequency of the dataframe.
# It is recommended not to up- or downsample at this step
# but rather to use interpolate to regularize the time series
# to its dominant or underlying frequency. Interpolate is not
# generally recommended for downsampling in this application.
df = rdtools.interpolate(df, freq)

# Calculate cell temperature
df['Tcell'] = pvlib.temperature.sapm_cell(df.poa, df.Tamb,
                                          df.wind_speed, **meta['temp_model_params'])

# plot the AC power time series
fig, ax = plt.subplots(figsize=(4,3))
ax.plot(df.index, df.power_ac, 'o', alpha=0.01)
ax.set_ylim(0,1500)
fig.autofmt_xdate()
ax.set_ylabel('AC Power (W)');
_images/examples_degradation_and_soiling_example_pvdaq_4_5_0.png

1: Normalize

Data normalization is achieved with rdtools.normalize_with_expected_power(). This function can be used to normalize to any modeled or expected power. Note that realized PV output can be given as energy, rather than power, by using an optional keyword argument.
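
For instance, if the measured production were an energy series rather than a power series, the call would look like this sketch (pv_energy, expected_power, and poa_global are placeholder inputs; pv_input is the keyword shown in the function summary earlier on this page):

# Normalize a measured energy series instead of a power series
normalized, insolation = rdtools.normalize_with_expected_power(
    pv_energy, expected_power, poa_global, pv_input='energy')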

[5]:
# Calculate the expected power with a simple PVWatts DC model
modeled_power = pvlib.pvsystem.pvwatts_dc(df['poa'], df['Tcell'], meta['power_dc_rated'],
                                          meta['gamma_pdc'], 25.0 )

# Calculate the normalization, the function also returns the relevant insolation for
# each point in the normalized PV energy timeseries
normalized, insolation = rdtools.normalize_with_expected_power(df['power_ac'],
                                                               modeled_power,
                                                               df['poa'])

df['normalized'] = normalized
df['insolation'] = insolation

# Plot the normalized power time series
fig, ax = plt.subplots()
ax.plot(normalized.index, normalized, 'o', alpha = 0.05)
ax.set_ylim(0,2)
fig.autofmt_xdate()
ax.set_ylabel('Normalized energy');
_images/examples_degradation_and_soiling_example_pvdaq_4_7_0.png

2: Filter

Data filtering is used to exclude data points that represent invalid data, create bias in the analysis, or introduce significant noise.

It can also be useful to remove outages and outliers. Sometimes outages appear as low but non-zero yield. Automatic functions for outage detection are not yet included in rdtools. However, this example does filter out data points where the normalized energy is less than 1%. System-specific filters should be implemented by the analyst if needed.
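
As an illustration of a system-specific filter (hypothetical dates, not part of this example's analysis), a known maintenance period could be excluded with one more boolean mask and combined with the masks below using &:

# Hypothetical custom filter: exclude a known maintenance period
start = pd.Timestamp('2012-06-01', tz=meta['timezone'])
end = pd.Timestamp('2012-06-15', tz=meta['timezone'])
maintenance_mask = ~((df.index >= start) & (df.index < end))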

[6]:
# Calculate a collection of boolean masks that can be used
# to filter the time series
normalized_mask = rdtools.normalized_filter(df['normalized'])
poa_mask = rdtools.poa_filter(df['poa'])
tcell_mask = rdtools.tcell_filter(df['Tcell'])
# Note: This clipping mask may be disabled when you are sure the system is not
# experiencing clipping due to high DC/AC ratio
clip_mask = rdtools.clip_filter(df['power_ac'])

# filter the time series and keep only the columns needed for the
# remaining steps
filtered = df[normalized_mask & poa_mask & tcell_mask & clip_mask]
filtered = filtered[['insolation', 'normalized']]

fig, ax = plt.subplots()
ax.plot(filtered.index, filtered.normalized, 'o', alpha = 0.05)
ax.set_ylim(0,2)
fig.autofmt_xdate()
ax.set_ylabel('Normalized energy');
_images/examples_degradation_and_soiling_example_pvdaq_4_9_0.png

3: Aggregate

Data is aggregated with an irradiance weighted average. This can be useful, for example with daily aggregation, to reduce the impact of high-error data points in the morning and evening.

[7]:
daily = rdtools.aggregation_insol(filtered.normalized, filtered.insolation,
                                  frequency = 'D')

fig, ax = plt.subplots()
ax.plot(daily.index, daily, 'o', alpha = 0.1)
ax.set_ylim(0,2)
fig.autofmt_xdate()
ax.set_ylabel('Normalized energy');
_images/examples_degradation_and_soiling_example_pvdaq_4_11_0.png

4: Degradation calculation

Data is then analyzed to estimate the degradation rate representing the PV system behavior. The results are visualized and statistics are reported, including the 68.2% confidence interval and the P95 exceedance value.

[8]:
# Calculate the degradation rate using the YoY method
yoy_rd, yoy_ci, yoy_info = rdtools.degradation_year_on_year(daily, confidence_level=68.2)
# Note the default confidence_level of 68.2 is appropriate if you would like to
# report a confidence interval analogous to the standard deviation of a normal
# distribution. The size of the confidence interval is adjustable by setting the
# confidence_level variable.

# Visualize the results

degradation_fig = rdtools.degradation_summary_plots(
    yoy_rd, yoy_ci, yoy_info, daily,
    summary_title='Sensor-based degradation results',
    scatter_ymin=0.5, scatter_ymax=1.1,
    hist_xmin=-30, hist_xmax=45, bins=100
)
_images/examples_degradation_and_soiling_example_pvdaq_4_13_0.png

In addition to the confidence interval, the year-on-year method yields an exceedance value (e.g. P95), the degradation rate that was exceeded (slower degradation) with a given probability level. The probability level is set via the exceedance_prob keyword in degradation_year_on_year.
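
For example, a different exceedance probability can be requested (a sketch requesting P99 instead of the default P95):

# Request a P99 exceedance level instead of the default P95
yoy_rd, yoy_ci, yoy_info = rdtools.degradation_year_on_year(
    daily, confidence_level=68.2, exceedance_prob=99)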

[9]:
print('The P95 exceedance level is %.2f%%/yr' % yoy_info['exceedance_level'])
The P95 exceedance level is -0.63%/yr

5: Soiling calculations

This section illustrates how the aggregated data can be used to estimate soiling losses using the stochastic rate and recovery (SRR) method.¹ Since our example system doesn’t experience much soiling, we apply an artificially generated soiling signal, just for the sake of example.

¹ M. G. Deceglie, L. Micheli and M. Muller, “Quantifying Soiling Loss Directly From PV Yield,” IEEE Journal of Photovoltaics, vol. 8, no. 2, pp. 547-551, March 2018. doi: 10.1109/JPHOTOV.2017.2784682

[10]:
# Apply artificial soiling signal for example
# be sure to remove this for applications on real data,
# and proceed with analysis on `daily` instead of `soiled_daily`

soiling = df['soiling'].resample('D').mean()
soiled_daily = soiling*daily
[11]:
# Calculate the daily insolation, required for the SRR calculation
daily_insolation = filtered['insolation'].resample('D').sum()

# Perform the SRR calculation
from rdtools.soiling import soiling_srr
cl = 68.2
sr, sr_ci, soiling_info = soiling_srr(soiled_daily, daily_insolation,
                                      confidence_level=cl)
/Users/mdecegli/Documents/GitHub/rdtools/rdtools/soiling.py:15: UserWarning: The soiling module is currently experimental. The API, results, and default behaviors may change in future releases (including MINOR and PATCH releases) as the code matures.
  'The soiling module is currently experimental. The API, results, '
[12]:
print('The P50 insolation-weighted soiling ratio is %0.3f'%sr)
The P50 insolation-weighted soiling ratio is 0.945
[13]:
print('The %0.1f confidence interval for the insolation-weighted'
      ' soiling ratio is %0.3f–%0.3f'%(cl, sr_ci[0], sr_ci[1]))
The 68.2 confidence interval for the insolation-weighted soiling ratio is 0.939–0.951
[14]:
# Plot Monte Carlo realizations of soiling profiles
fig = rdtools.plotting.soiling_monte_carlo_plot(soiling_info, soiled_daily, profiles=200);
/Users/mdecegli/Documents/GitHub/rdtools/rdtools/plotting.py:151: UserWarning: The soiling module is currently experimental. The API, results, and default behaviors may change in future releases (including MINOR and PATCH releases) as the code matures.
  'The soiling module is currently experimental. The API, results, '
_images/examples_degradation_and_soiling_example_pvdaq_4_21_1.png
[15]:
# Plot the slopes for "valid" soiling intervals identified,
# assuming perfect cleaning events
fig = rdtools.plotting.soiling_interval_plot(soiling_info, soiled_daily);
/Users/mdecegli/Documents/GitHub/rdtools/rdtools/plotting.py:211: UserWarning: The soiling module is currently experimental. The API, results, and default behaviors may change in future releases (including MINOR and PATCH releases) as the code matures.
  'The soiling module is currently experimental. The API, results, '
_images/examples_degradation_and_soiling_example_pvdaq_4_22_1.png
[16]:
# View the first several rows of the soiling interval summary table
soiling_summary = soiling_info['soiling_interval_summary']
soiling_summary.head()
[16]:
start end soiling_rate soiling_rate_low soiling_rate_high inferred_start_loss inferred_end_loss length valid
0 2010-02-25 00:00:00-07:00 2010-03-06 00:00:00-07:00 0.000000 0.000000 0.000000 0.685379 0.863517 9 False
1 2010-03-07 00:00:00-07:00 2010-03-11 00:00:00-07:00 0.000000 0.000000 0.000000 1.053439 1.003025 4 False
2 2010-03-12 00:00:00-07:00 2010-04-08 00:00:00-07:00 -0.002505 -0.005069 0.000000 1.058785 0.991144 27 True
3 2010-04-09 00:00:00-07:00 2010-04-11 00:00:00-07:00 0.000000 0.000000 0.000000 1.044975 1.044975 2 False
4 2010-04-12 00:00:00-07:00 2010-06-15 00:00:00-07:00 -0.000594 -0.000997 -0.000174 1.011211 0.973207 64 True
[17]:
# View a histogram of the valid soiling rates found for the data set
fig = rdtools.plotting.soiling_rate_histogram(soiling_info, bins=50)
/Users/mdecegli/Documents/GitHub/rdtools/rdtools/plotting.py:251: UserWarning: The soiling module is currently experimental. The API, results, and default behaviors may change in future releases (including MINOR and PATCH releases) as the code matures.
  'The soiling module is currently experimental. The API, results, '
_images/examples_degradation_and_soiling_example_pvdaq_4_24_1.png

These plots show generally good results from the SRR method. In this example, we have slightly overestimated the soiling loss because we used the default behavior of the method keyword argument in rdtools.soiling_srr(), which does not assume that every cleaning is perfect, whereas the artificial soiling signal in this example did include perfect cleaning. We encourage you to adjust the options of rdtools.soiling_srr() for your application.
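
For instance, because the artificial soiling signal here includes perfect cleaning, a closer match could be obtained by changing the method keyword (a sketch; 'half_norm_clean' is the default noted in the next cell):

# Re-run the SRR calculation assuming every cleaning fully restores performance
sr_pc, sr_ci_pc, soiling_info_pc = soiling_srr(soiled_daily, daily_insolation,
                                               confidence_level=cl,
                                               method='perfect_clean')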

[18]:
# Calculate and view a monthly soiling rate summary
from rdtools.soiling import monthly_soiling_rates
monthly_soiling_rates(soiling_info['soiling_interval_summary'],
                      confidence_level=cl)
[18]:
month soiling_rate_median soiling_rate_low soiling_rate_high interval_count
0 1 -0.000942 -0.001954 -0.000692 6
1 2 -0.001794 -0.006180 -0.000752 7
2 3 -0.001096 -0.002230 -0.000394 10
3 4 -0.000924 -0.001899 -0.000122 9
4 5 -0.000305 -0.000733 -0.000086 7
5 6 -0.000331 -0.000777 -0.000091 8
6 7 -0.000404 -0.001342 -0.000140 8
7 8 -0.000674 -0.001779 -0.000182 7
8 9 -0.000856 -0.001572 -0.000191 8
9 10 -0.000881 -0.001413 -0.000203 8
10 11 -0.000920 -0.001894 -0.000229 8
11 12 -0.000947 -0.002455 -0.000691 6
[19]:
# Calculate and view annual insolation-weighted soiling ratios and their confidence
# intervals based on the Monte Carlo simulation. Note that these losses include the
# cleaning assumptions associated with the method parameter
# of rdtools.soiling_srr(). For anything but 'perfect_clean', each year's soiling
# ratio may be impacted by prior years' soiling profiles. The default behavior of
# rdtools.soiling_srr uses method='half_norm_clean'

from rdtools.soiling import annual_soiling_ratios
annual_soiling_ratios(soiling_info['stochastic_soiling_profiles'],
                      daily_insolation,
                      confidence_level=cl)
[19]:
year soiling_ratio_median soiling_ratio_low soiling_ratio_high
0 2010 0.961769 0.950512 0.969079
1 2011 0.944563 0.937086 0.950570
2 2012 0.939465 0.931211 0.945439
3 2013 0.954355 0.944595 0.961878
4 2014 0.949834 0.929179 0.965085
5 2015 0.950557 0.921117 0.966028
6 2016 0.937150 0.925213 0.944815

Clear sky workflow

The clear sky workflow is useful in that it avoids problems due to drift or recalibration of ground-based sensors. We use pvlib to model the clear sky irradiance, which is then renormalized to align it with the ground-based measurements. Finally, we use rdtools.get_clearsky_tamb() to model the ambient temperature on clear sky days. This modeled ambient temperature is used to model cell temperature with pvlib. If high-quality ambient temperature data is available, it can be used instead of the modeled ambient temperature; we proceed with the modeled values here for illustrative purposes.

In this example, note that we have omitted wind data from the cell temperature calculations for illustrative purposes. Wind data can also be included for improved results when the data source is trusted.

We generally recommend that the clear sky workflow be used as a check on the sensor workflow. It tends to be more sensitive than the sensor workflow, and thus we don’t recommend it as a stand-alone analysis.

Note that the calculations below rely on some objects from the steps above.

Clear Sky 0: Preliminary Calculations

[20]:
# Calculate the clear sky POA irradiance
clearsky = loc.get_clearsky(df.index, solar_position=sun)

cs_sky = pvlib.irradiance.isotropic(meta['tilt'], clearsky.dhi)
cs_beam = pvlib.irradiance.beam_component(meta['tilt'], meta['azimuth'],
                                          sun.zenith, sun.azimuth, clearsky.dni)
df['clearsky_poa'] = cs_beam + cs_sky

# Renormalize the clear sky POA irradiance
df['clearsky_poa'] = rdtools.irradiance_rescale(df.poa, df.clearsky_poa,
                                                method='iterative')

# Calculate the clearsky temperature
df['clearsky_Tamb'] = rdtools.get_clearsky_tamb(df.index, meta['latitude'],
                                                meta['longitude'])
df['clearsky_Tcell'] = pvlib.temperature.sapm_cell(df.clearsky_poa, df.clearsky_Tamb,
                                                   0, **meta['temp_model_params'])

Clear Sky 1: Normalize

Normalize as in step 1 above, but this time using clearsky modeled irradiance and cell temperature.

[21]:
# Calculate the expected power with a simple PVWatts DC model
clearsky_modeled_power = pvlib.pvsystem.pvwatts_dc(df['clearsky_poa'],
                                                   df['clearsky_Tcell'],
                                                   meta['power_dc_rated'], meta['gamma_pdc'], 25.0 )

# Calculate the normalization, the function also returns the relevant insolation for
# each point in the normalized PV energy timeseries
clearsky_normalized, clearsky_insolation = rdtools.normalize_with_expected_power(
    df['power_ac'],
    clearsky_modeled_power,
    df['clearsky_poa']
)

df['clearsky_normalized'] = clearsky_normalized
df['clearsky_insolation'] = clearsky_insolation

Clear Sky 2: Filter

Filter as in step 2 above, but with the addition of a clear sky index (csi) filter so we consider only points well modeled by the clear sky irradiance model.

[22]:
# Perform clearsky filter
cs_normalized_mask = rdtools.normalized_filter(df['clearsky_normalized'])
cs_poa_mask = rdtools.poa_filter(df['clearsky_poa'])
cs_tcell_mask = rdtools.tcell_filter(df['clearsky_Tcell'])

csi_mask = rdtools.csi_filter(df.insolation, df.clearsky_insolation)

clearsky_filtered = df[cs_normalized_mask & cs_poa_mask & cs_tcell_mask &
                       clip_mask & csi_mask]
clearsky_filtered = clearsky_filtered[['clearsky_insolation', 'clearsky_normalized']]

Clear Sky 3: Aggregate

Aggregate the clear sky version of the filtered data.

[23]:
clearsky_daily = rdtools.aggregation_insol(clearsky_filtered.clearsky_normalized,
                                           clearsky_filtered.clearsky_insolation)

Clear Sky 4: Degradation Calculation

Estimate the degradation rate and compare to the results obtained with sensors. In this case, we see that the degradation rate estimated with the clearsky methodology is not far off from the sensor-based estimate.

[24]:
# Calculate the degradation rate using the YoY method
cs_yoy_rd, cs_yoy_ci, cs_yoy_info = rdtools.degradation_year_on_year(
    clearsky_daily,
    confidence_level=68.2
)

# Note the default confidence_level of 68.2 is appropriate if you would like to
# report a confidence interval analogous to the standard deviation of a normal
# distribution. The size of the confidence interval is adjustable by setting the
# confidence_level variable.

# Visualize the results
clearsky_fig = rdtools.degradation_summary_plots(
    cs_yoy_rd, cs_yoy_ci, cs_yoy_info, clearsky_daily,
    summary_title='Clear-sky-based degradation results',
    scatter_ymin=0.5, scatter_ymax=1.1,
    hist_xmin=-30, hist_xmax=45, plot_color='orangered',
    bins=100);

print('The P95 exceedance level with the clear sky analysis is %.2f%%/yr' %
      cs_yoy_info['exceedance_level'])
The P95 exceedance level with the clear sky analysis is -0.81%/yr
_images/examples_degradation_and_soiling_example_pvdaq_4_38_1.png
[25]:
# Compare to previous sensor results
degradation_fig
[25]:
_images/examples_degradation_and_soiling_example_pvdaq_4_39_0.png

System availability example

This notebook shows example usage of the inverter availability functions. As with the degradation and soiling example, we recommend installing the specific versions of packages used to develop this notebook. This can be achieved in your environment by running pip install -r requirements.txt followed by pip install -r docs/notebook_requirements.txt from the base directory. (RdTools must also be separately installed.) These environments and examples are tested with Python 3.7.

RdTools currently implements two methods of quantifying system availability. The first method compares power measurements from inverters and the system meter to distinguish subsystem communication interruptions from true outage events. The second method determines the uncertainty bounds around an energy estimate of a total system outage and compares with true production calculated from a meter’s cumulative production measurements. The RdTools AvailabilityAnalysis class uses both methods to quantify downtime loss.

These methods are described in K. Anderson and R. Blumenthal, “Overcoming Communications Outages in Inverter Downtime Analysis”, 2020 IEEE 47th Photovoltaic Specialists Conference (PVSC).

[1]:
import rdtools
import pvlib

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Quantifying the production impact of inverter downtime events is complicated by gaps in a system’s historical data caused by communication interruptions. Although communication interruptions may prevent remote operation, they usually do not result in production loss. Accurate production loss estimates require the ability to distinguish true outages from communication interruptions.

The first method focuses on partial outages where some of a system’s inverters are reporting production and some are not. In these cases, the method examines the AC power measurements at the inverter and system meter level to classify each timestamp individually and estimate timeseries production loss. This level of granularity is made possible by comparing timeseries power measurements between inverters and the meter.

Create a test dataset

First we’ll generate a test dataset to demonstrate the method. This code block just puts together an artificial dataset to use for the analysis – feel free to skip ahead to where it gets plotted.

[2]:
def make_dataset():
    """
    Make an example dataset with several types of data outages for availability analysis.

    Returns
    -------
    df_reported : pd.DataFrame
        Simulated data as a data acquisition system would report it, including the
        effect of communication interruptions.
    df_secret : pd.DataFrame
        The secret true data of the system, not affected by communication
        interruptions.  Only used for comparison with the analysis output.
    expected_power : pd.Series
        An "expected" power signal for this hypothetical PV system, simulating a
        modeled power from satellite weather data or some other method.

    (This function creates instantaneous data. AvailabilityAnalysis is technically designed
    to work with right-labeled averages. However, for the purposes of the example, the
    approximation is suitable.)
    """

    # generate a plausible clear-sky power signal
    times = pd.date_range('2019-01-01', '2019-01-12', freq='15min', tz='US/Eastern',
                          closed='left')
    location = pvlib.location.Location(40, -80)
    clearsky = location.get_clearsky(times, model='haurwitz')
    # just scale GHI to power for simplicity
    base_power = 2.5*clearsky['ghi']
    # but require a minimum irradiance to turn on, simulating start-up voltage
    base_power[clearsky['ghi'] < 20] = 0

    df_secret = pd.DataFrame({
        'inv1_power': base_power,
        'inv2_power': base_power * 1.5,
        'inv3_power': base_power * 0.66,
    })

    # set the expected_power to be pretty close to actual power,
    # but with some autocorrelated noise and a bias:
    expected_power = df_secret.sum(axis=1)
    np.random.seed(2020)
    N = len(times)
    expected_power *= 0.9 - (0.3 * np.sin(np.arange(0, N)/7 +
                             np.random.normal(0, 0.2, size=N)))

    # Add a few days of individual inverter outages:
    df_secret.loc['2019-01-03':'2019-01-05', 'inv2_power'] = 0
    df_secret.loc['2019-01-02', 'inv3_power'] = 0
    df_secret.loc['2019-01-07 00:00':'2019-01-07 12:00', 'inv1_power'] = 0

    # and a full system outage:
    full_outage_date = '2019-01-08'
    df_secret.loc[full_outage_date, :] = 0

    # calculate the system meter power and cumulative production,
    # including the effect of the outages:
    df_secret['meter_power'] = df_secret.sum(axis=1)
    interval_energy = rdtools.energy_from_power(df_secret['meter_power'])
    df_secret['meter_energy'] = interval_energy.cumsum()
    # fill the first NaN from the cumsum with 0
    df_secret['meter_energy'] = df_secret['meter_energy'].fillna(0)
    # add an offset to reflect previous production:
    df_secret['meter_energy'] += 5e5
    # calculate cumulative energy for an inverter as well:
    inv2_energy = rdtools.energy_from_power(df_secret['inv2_power'])
    df_secret['inv2_energy'] = inv2_energy.cumsum().fillna(0)

    # now that the "true" data is in place, let's add some communications interruptions:
    df_reported = df_secret.copy()
    # in full outages, we lose all the data:
    df_reported.loc[full_outage_date, :] = np.nan
    # add a communications interruption that overlaps with an inverter outage:
    df_reported.loc['2019-01-05':'2019-01-06', 'inv1_power'] = np.nan
    # and a communication outage that affects everything:
    df_reported.loc['2019-01-10', :] = np.nan

    return df_reported, df_secret, expected_power

Let’s visualize the dataset before analyzing it with RdTools. The dotted lines show the “true” data that wasn’t recorded by the datalogger because of interrupted communications.

[3]:
df, df_secret, expected_power = make_dataset()

fig, axes = plt.subplots(3, 1, sharex=True, figsize=(8,6))
colors = plt.rcParams['axes.prop_cycle'].by_key()['color'][:3]

# inverter power
df_secret[['inv1_power', 'inv2_power', 'inv3_power']].plot(ax=axes[0],
                                                           legend=False, ls=':',
                                                           color=colors)
df[['inv1_power', 'inv2_power', 'inv3_power']].plot(ax=axes[0], legend=False)
# meter power
df_secret['meter_power'].plot(ax=axes[1], ls=':', color=colors[0])
df['meter_power'].plot(ax=axes[1])
# meter cumulative energy
df_secret['meter_energy'].plot(ax=axes[2], ls=':', color=colors[0])
df['meter_energy'].plot(ax=axes[2])

axes[0].set_ylabel('Inverter Power [kW]')
axes[1].set_ylabel('Meter Power [kW]')
axes[2].set_ylabel('Cumulative\nMeter Energy [kWh]')
plt.show()
_images/examples_system_availability_example_5_0.png

Note that the solid lines show the data that would be available in our example while the dotted lines show the true underlying behavior that we normally wouldn’t know.

If we hadn’t created this dataset ourselves, it wouldn’t necessarily be obvious why the meter shows low or no production on some days – maybe it was just cloudy weather, maybe it was a nuisance communication outage (broken cell modem power supply, for example), or maybe it was a true power outage. This example also shows how an inverter can appear to be offline while actually producing normally. For example, just looking at inverter power on the 5th, it appears that only the small inverter is producing. However, the meter shows two inverters’ worth of production. Similarly, the 6th shows full meter production despite one inverter not reporting power. Using only the inverter-reported power would overestimate the production loss because of the communication interruption.

System availability analysis

Now we’ll hand this data off to RdTools for analysis:

[4]:
from rdtools.availability import AvailabilityAnalysis
aa = AvailabilityAnalysis(
    power_system=df['meter_power'],
    power_subsystem=df[['inv1_power', 'inv2_power', 'inv3_power']],
    energy_cumulative=df['meter_energy'],
    power_expected=expected_power,
)
# identify and classify outages, rolling up to daily metrics for this short dataset:
aa.run(rollup_period='D')
/Users/mdecegli/opt/anaconda3/envs/final_release_test/lib/python3.7/site-packages/rdtools/availability.py:18: UserWarning: The availability module is currently experimental. The API, results, and default behaviors may change in future releases (including MINOR and PATCH releases) as the code matures.
  'The availability module is currently experimental. The API, results, '

First, we can visualize the estimated power loss and outage information:

[5]:
fig = aa.plot()
fig.set_size_inches(16, 7)
fig.axes[1].legend(loc='upper left');
/Users/mdecegli/opt/anaconda3/envs/final_release_test/lib/python3.7/site-packages/rdtools/plotting.py:320: UserWarning: The availability module is currently experimental. The API, results, and default behaviors may change in future releases (including MINOR and PATCH releases) as the code matures.
  'The availability module is currently experimental. The API, results, '
_images/examples_system_availability_example_9_1.png

Examining the plot of estimated lost power, we can see that the estimated loss is roughly in proportion to the amount of offline capacity. In particular, the loss estimate is robust to a mixed outage and communication interruption like the one on the 5th, when only the smallest inverter is reporting production but the analysis correctly infers that one of the other inverters is producing without communicating.

RdTools also reports rolled-up production and availability metrics:

[6]:
pd.set_option('precision', 3)
aa.results
[6]:
lost_production actual_production availability
2019-01-01 00:00:00-05:00 0.000 19606.785 1.000
2019-01-02 00:00:00-05:00 4114.031 15583.450 0.791
2019-01-03 00:00:00-05:00 9396.788 10399.112 0.525
2019-01-04 00:00:00-05:00 9466.477 10476.235 0.525
2019-01-05 00:00:00-05:00 9522.325 10538.040 0.525
2019-01-06 00:00:00-05:00 0.000 20185.784 1.000
2019-01-07 00:00:00-05:00 2859.565 17459.339 0.859
2019-01-08 00:00:00-05:00 19448.084 0.000 0.000
2019-01-09 00:00:00-05:00 0.000 20607.950 1.000
2019-01-10 00:00:00-05:00 0.000 20763.718 1.000
2019-01-11 00:00:00-05:00 0.000 20926.869 1.000

The AvailabilityAnalysis object has other attributes that may be useful to inspect as well. The outage_info dataframe has one row for each full system outage with several columns, perhaps the most interesting of which are type and loss.

See AvailabilityAnalysis? or help(AvailabilityAnalysis) for full descriptions of the available attributes.

[7]:
pd.set_option('precision', 2)
# Show the first half of the dataframe
N = len(aa.outage_info.columns)
aa.outage_info.iloc[:, :N//2]
[7]:
start end duration intervals daylight_intervals error_lower error_upper
0 2019-01-07 17:00:00-05:00 2019-01-09 08:00:00-05:00 1 days 15:00:00 157 35 -0.24 0.25
1 2019-01-09 17:00:00-05:00 2019-01-11 08:00:00-05:00 1 days 15:00:00 157 35 -0.24 0.25
[8]:
# Show the second half
aa.outage_info.iloc[:, N//2:]
[8]:
energy_expected energy_start energy_end energy_actual ci_lower ci_upper type loss
0 19448.08 604248.74 604248.74 0.00 14819.33 24271.15 real 19448.08
1 25284.75 624856.69 645620.41 20763.72 19266.84 31555.29 comms 0.00

Other use cases

Although this demo applies the methods for an entire PV system (comparing inverters against the meter and comparing the meter against expected power), they can also be used at the individual inverter level. Because there are no subsystems to compare against, the "full outage" analysis branch is used for every outage. That means that instead of basing the loss on the other inverters, it relies on the expected power time series being accurate. In this example, because the expected power signal is somewhat inaccurate, the loss estimates end up overestimated:

[9]:
# make a new analysis object:
aa2 = rdtools.availability.AvailabilityAnalysis(
    power_system=df['inv2_power'],
    power_subsystem=df['inv2_power'].to_frame(),
    energy_cumulative=df['inv2_energy'],
    # okay to use the system-level expected power here because it gets rescaled anyway
    power_expected=expected_power,
)
# identify and classify outages, rolling up to daily metrics for this short dataset:
aa2.run(rollup_period='D')
print(aa2.results['lost_production'])
2019-01-01 00:00:00-05:00        0.00
2019-01-02 00:00:00-05:00        0.00
2019-01-03 00:00:00-05:00     9931.24
2019-01-04 00:00:00-05:00    11453.27
2019-01-05 00:00:00-05:00    11238.57
2019-01-06 00:00:00-05:00        0.00
2019-01-07 00:00:00-05:00        0.00
2019-01-08 00:00:00-05:00     9505.33
2019-01-09 00:00:00-05:00        0.00
2019-01-10 00:00:00-05:00        0.00
2019-01-11 00:00:00-05:00        0.00
Freq: D, Name: lost_production, dtype: float64
[10]:
aa2.plot();
/Users/mdecegli/opt/anaconda3/envs/final_release_test/lib/python3.7/site-packages/rdtools/plotting.py:320: UserWarning: The availability module is currently experimental. The API, results, and default behaviors may change in future releases (including MINOR and PATCH releases) as the code matures.
  'The availability module is currently experimental. The API, results, '
_images/examples_system_availability_example_17_1.png

API reference

Submodules

RdTools is organized into submodules focused on different parts of the data analysis workflow.

degradation Functions for calculating the degradation rate of photovoltaic systems.
soiling Functions for calculating soiling metrics from photovoltaic system data.
availability Functions for detecting and quantifying production loss from photovoltaic system downtime events.
filtering Functions for filtering and subsetting PV system data.
normalization Functions for normalizing, rescaling, and regularizing PV system data.
aggregation Functions for calculating weighted aggregates of PV system data.
clearsky_temperature Functions for estimating clear-sky ambient temperature.
plotting Functions for plotting degradation and soiling analysis results.

Degradation

Functions for calculating the degradation rate of photovoltaic systems.

degradation_classical_decomposition(...[, ...]) Estimate the trend of a timeseries using a classical decomposition approach (moving average) and calculate various statistics, including the result of a Mann-Kendall test and a Monte Carlo-derived confidence interval of slope.
degradation_ols(energy_normalized[, ...]) Estimate the trend of a timeseries using ordinary least-squares regression and calculate various statistics including a Monte Carlo-derived confidence interval of slope.
degradation_year_on_year(energy_normalized) Estimate the trend of a timeseries using the year-on-year decomposition approach and calculate a Monte Carlo-derived confidence interval of slope.

Soiling

Functions for calculating soiling metrics from photovoltaic system data.

The soiling module is currently experimental. The API, results, and default behaviors may change in future releases (including MINOR and PATCH releases) as the code matures.

soiling_srr(energy_normalized_daily, ...[, ...]) Functional wrapper for SRRAnalysis.
monthly_soiling_rates(soiling_interval_summary) Use Monte Carlo to calculate typical monthly soiling rates.
annual_soiling_ratios(...[, confidence_level]) Return annualized soiling ratios and associated confidence intervals based on stochastic soiling profiles from SRR.
SRRAnalysis(energy_normalized_daily, ...[, ...]) Class for running the stochastic rate and recovery (SRR) photovoltaic soiling loss analysis presented in Deceglie et al.
SRRAnalysis.run([reps, day_scale, ...]) Run the SRR method from beginning to end.

System Availability

Functions for detecting and quantifying production loss from photovoltaic system downtime events.

The availability module is currently experimental. The API, results, and default behaviors may change in future releases (including MINOR and PATCH releases) as the code matures.

AvailabilityAnalysis(power_system, ...) A class to perform system availability and loss analysis.
AvailabilityAnalysis.run([low_threshold, ...]) Run the availability analysis.
AvailabilityAnalysis.plot() Create a figure summarizing the availability analysis results.

Filtering

Functions for filtering and subsetting PV system data.

clip_filter(power_ac[, quantile]) Filter data points likely to be affected by clipping with power greater than or equal to 99% of the quant quantile.
csi_filter(poa_global_measured, ...[, threshold]) Filtering based on clear-sky index (csi)
poa_filter(poa_global[, poa_global_low, ...]) Filter POA irradiance readings outside acceptable measurement bounds.
tcell_filter(temperature_cell[, ...]) Filter temperature readings outside acceptable measurement bounds.
normalized_filter(energy_normalized[, ...]) Select normalized yield between low_cutoff and high_cutoff

Normalization

Functions for normalizing, rescaling, and regularizing PV system data.

energy_from_power(power[, target_frequency, ...]) Returns a regular right-labeled energy time series in units of Wh per interval from a power time series.
interpolate(time_series, target[, ...]) Returns an interpolation of time_series, excluding times associated with gaps in each column of time_series longer than max_timedelta; NaNs are returned within those gaps.
irradiance_rescale(irrad, irrad_sim[, ...]) Attempt to rescale modeled irradiance to match measured irradiance on clear days.
normalize_with_expected_power(pv, ...[, ...]) Normalize PV power or energy based on expected PV power.
normalize_with_pvwatts(energy, pvwatts_kws) Normalize system AC energy output given measured poa_global and meteorological data.
normalize_with_sapm(energy, sapm_kws)

Deprecated since version 2.0.0.

pvwatts_dc_power(poa_global, power_dc_rated) PVWatts v5 Module Model: DC power given effective poa poa_global, module nameplate power, and cell temperature.
sapm_dc_power(pvlib_pvsystem, met_data)

Deprecated since version 2.0.0.

delta_index(series)

Deprecated since version 2.0.0.

check_series_frequency(series, ...)

Deprecated since version 2.0.0.

Aggregation

Functions for calculating weighted aggregates of PV system data.

aggregation_insol(energy_normalized, insolation) Insolation weighted aggregation

Clear-Sky Temperature

Functions for estimating clear-sky ambient temperature.

get_clearsky_tamb(times, latitude, longitude) Estimates the ambient temperature at latitude and longitude for the given times using a Gaussian rolling window.

Plotting

Functions for plotting degradation and soiling analysis results.

degradation_summary_plots(yoy_rd, yoy_ci, ...) Create plots (scatter plot and histogram) that summarize degradation analysis results.
soiling_monte_carlo_plot(soiling_info, ...) Create figure to visualize Monte Carlo of soiling profiles used in the SRR analysis.
soiling_interval_plot(soiling_info, ...[, ...]) Create figure to visualize valid soiling profiles used in the SRR analysis.
soiling_rate_histogram(soiling_info[, bins]) Create histogram of soiling rates found in the SRR analysis.
availability_summary_plots(power_system, ...) Create a figure summarizing the availability analysis results.

RdTools Change Log

v2.0.5 (December 30, 2020)

Testing

  • Add a flake8 code style check to the continuous integration checks (GH #231)
  • Moved several pytest fixtures from soiling_test.py and availability_test.py to conftest.py so that they are shared across test files (GH #231)
  • Add Python 3.9 to CI testing (GH #249)
  • Fix test suite error raised when using pandas 1.2.0 (GH #251)

Documentation

  • Organized example notebooks into a sphinx gallery (GH #240)

Requirements

  • Add support for python 3.9 (GH #249)
  • Update requirements.txt versions for numpy, scipy, pandas, h5py and statsmodels to versions that have wheels available for python 3.6-3.9. Note that the minimum versions are unchanged. (GH #249).

Contributors

v2.0.4 (December 4, 2020)

Bug Fixes

  • Fix bug related to leading NaN values with energy_from_power(). This fixed a small normalization error in degradation_and_soiling_example.ipynb and slightly changed the clear-sky degradation results (GH #244, GH #245)

Contributors

v2.0.3 (November 20, 2020)

Requirements

  • Change to docs/notebook_requirements.txt: notebook version from 5.7.8 to 6.1.5 and terminado version from 0.8.1 to 0.8.3 (GH #239)

Contributors

v2.0.2 (November 17, 2020)

Examples

Contributors

v2.0.1 (October 30, 2020)

Deprecations

Contributors

v2.0.0 (October 20, 2020)

Version 2.0.0 adds experimental soiling and availability modules, plotting capability, and updates to the normalization workflow. This major release introduces some breaking changes to the API. Details below.

API Changes

Enhancements

Bug fixes

Testing

  • Add Python 3.7 and 3.8 to CI testing (GH #135).
  • Add CI configuration based on the minimum dependency versions (GH #197)

Documentation

  • Create sphinx documentation and set up ReadTheDocs (GH #125).
  • Add guides on running tests and building sphinx docs (GH #136).
  • Improve module-level docstrings (GH #137).
  • Update landing page and add new "Inverter Downtime" documentation page based on the availability notebook (GH #131)

Requirements

  • Drop support for Python 2.7, minimum supported version is now 3.6 (GH #135).
  • Increase minimum pvlib version to 0.7.0 (GH #170)
  • Update requirements.txt and notebook_requirements.txt to avoid conflicting specifications. Taken together, they represent the complete environment for the notebook example (GH #164).
  • Add minimum matplotlib requirement of 3.0.0 (released September 18, 2018) (GH #197)
  • Increase minimum numpy version from 1.12 (released January 15, 2017) to 1.15 (released July 23, 2018) (GH #197)

Example Updates

  • Seed numpy.random to ensure repeatable results (GH #164).
  • Use normalized_filter() instead of manually filtering the normalized energy timeseries. Also updated the associated mask variable names (GH #139).
  • Add soiling section to the original example notebook.
  • Add a new example notebook that analyzes data from a PV system located at NREL's South Table Mountain campus (PVDAQ system #4) (GH #171).
  • Explicitly register pandas datetime converters which were deprecated.
  • Add new system_availability_example.ipynb notebook (GH #131)

Contributors

v1.2.3 (April 12, 2020)

  • Updates dependencies
  • Versioneer bug fix
  • License update

Contributors

v1.2.2 (October 12, 2018)

Patch that adds author email to enable pypi deployment

Contributors

v1.2.1 (October 12, 2018)

This update includes automated testing and deployment to support development along with some bug fixes to the library itself, a documented environment for the example notebook, and new example results to reflect changes in the example dataset. It addresses GH #49, GH #76, GH #78, GH #79, GH #80, GH #85, GH #86, and GH #92.

Contributors

v1.2.0 (March 30, 2018)

This incorporates changes including:

  • Enables users to control confidence intervals reported in degradation calculations (GH #59)
  • Adds python 3 support (GH #56 and GH #67)
  • Fixes bugs (GH #61 GH #57)
  • Improvements/typo fixes to docstrings
  • Fixes error in check for two years of data in degradation_year_on_year
  • Improves the calculations underlying irradiance_rescale

Contributors

v1.1.3 (December 6, 2017)

This patch includes the following changes:

  1. Update the notebook for improved plotting with Pandas v.0.21.0
  2. Fix installation bug related to package data

Contributors

v1.1.2 (November 6, 2017)

This patch includes the following changes:

  1. Fix bugs in installation
  2. Update requirements
  3. Notebook plots made compatible with pandas v.0.21.0

Contributors

v1.1.1 (November 1, 2017)

This patch:

  1. Improves documentation
  2. Fixes installation requirements

Contributors

v1.1.0 (September 30, 2017)

This update includes the addition of filters, functions to support a clear-sky workflow, and updates to the example notebook.

Contributors

Developer Notes

This page documents some of the workflows specific to RdTools development.

Installing RdTools source code

To make changes to RdTools, run the test suite, or build the documentation locally, you'll need to have a local copy of the git repository. Installing RdTools using pip will install a condensed version that doesn't include the full source code. To get the full source code, you'll need to clone the RdTools source repository from GitHub, e.g. with

git clone https://github.com/NREL/rdtools.git

from the command line, or using a GUI git client like GitHub Desktop. This will clone the entire git repository onto your computer.

Installing RdTools dependencies

The packages necessary to run RdTools itself can be installed with pip. You can install the dependencies along with RdTools itself from PyPI:

pip install rdtools

This will install the latest official release of RdTools. If you want to work with a development version and you have cloned the Github repository to your computer, you can also install RdTools and dependencies by navigating to the repository root, switching to the branch you're interested in, for instance:

git checkout development

and running:

pip install .

This will install based on whatever RdTools branch you have checked out. You can check what version is currently installed by inspecting rdtools.__version__:

>>> rdtools.__version__
'1.2.0+188.g5a96bb2'

The hex string at the end represents the hash of the git commit for your installed version.

Installing optional dependencies

RdTools has extra dependencies for running its test suite and building its documentation. These packages aren't necessary for running RdTools itself and are only needed if you want to contribute source code to RdTools.

Note

These will install RdTools along with other packages necessary to build its documentation and run its test suite. We recommend doing this in a virtual environment to keep package installations between projects separate!

Optional dependencies can be installed with the special syntax:

pip install rdtools[test]  # test suite dependencies
pip install rdtools[doc]   # documentation dependencies

Or, if your local repository has an updated dependencies list:

pip install .[test]  # test suite dependencies
pip install .[doc]   # documentation dependencies

Running the test suite

RdTools uses pytest to run its test suite. If you haven't already, install the testing dependencies (Installing optional dependencies).

To run the entire test suite, navigate to the git repo folder and run

pytest

For convenience, pytest lets you run tests for a single module if you don't want to wait around for the entire suite to finish:

pytest rdtools/test/soiling_test.py

And even a single test function:

pytest rdtools/test/soiling_test.py::test_soiling_srr

You can also evaluate code coverage when running the test suite using the coverage package:

coverage run -m pytest
coverage report

The first line runs the test suite and keeps track of exactly what lines of code were run during test execution. The second line then prints out a summary report showing how much of each source file was executed in the test suite. If a percentage is below 100, that means a function isn't tested or a branch inside a function isn't tested. To get specific details, you can run coverage html to generate a detailed HTML report at htmlcov/index.html to view in a browser.

Checking for code style

RdTools uses flake8 to validate code style. To run this check locally you'll need to have flake8 installed (see Installing optional dependencies). Then navigate to the git repo folder and run

flake8

Or, for a more detailed report:

flake8 --count --statistics --show-source

Building documentation locally

RdTools uses Sphinx to build its documentation. If you haven't already, install the documentation dependencies (Installing optional dependencies).

Once the required packages are installed, change your console's working directory to rdtools/docs/sphinx and run

make html

Note that on Windows, you don't actually need the make utility installed for this to work because there is a make.bat in this directory. Building the docs should result in output like this:

(venv)$ make html
Running Sphinx v1.8.5
making output directory...
[autosummary] generating autosummary for: api.rst, example.nblink, index.rst, readme_link.rst
[autosummary] generating autosummary for: C:\Users\KANDERSO\projects\rdtools\docs\sphinx\source\generated\rdtools.aggregation.aggregation_insol.rst, C:\Users\KANDERSO\projects\rdtools\docs\sphinx\source\generated\rdtools.aggregation.rst, C:\Users\KANDERSO\projects\rdtools\docs\sphinx\source\generated\rdtools.clearsky_temperature.get_clearsky_tamb.rst, C:\Users\KANDERSO\projects\rdtools\docs\sphinx\source\generated\rdtools.clearsky_temperature.rst, C:\Users\KANDERSO\projects\rdtools\docs\sphinx\source\generated\rdtools.degradation.degradation_classical_decomposition.rst, C:\Users\KANDERSO\projects\rdtools\docs\sphinx\source\generated\rdtools.degradation.degradation_ols.rst, C:\Users\KANDERSO\projects\rdtools\docs\sphinx\source\generated\rdtools.degradation.degradation_year_on_year.rst, C:\Users\KANDERSO\projects\rdtools\docs\sphinx\source\generated\rdtools.degradation.rst, C:\Users\KANDERSO\projects\rdtools\docs\sphinx\source\generated\rdtools.filtering.clip_filter.rst, C:\Users\KANDERSO\projects\rdtools\docs\sphinx\source\generated\rdtools.filtering.csi_filter.rst, ..., C:\Users\KANDERSO\projects\rdtools\docs\sphinx\source\generated\rdtools.normalization.normalize_with_pvwatts.rst, C:\Users\KANDERSO\projects\rdtools\docs\sphinx\source\generated\rdtools.normalization.normalize_with_sapm.rst, C:\Users\KANDERSO\projects\rdtools\docs\sphinx\source\generated\rdtools.normalization.pvwatts_dc_power.rst, C:\Users\KANDERSO\projects\rdtools\docs\sphinx\source\generated\rdtools.normalization.rst, C:\Users\KANDERSO\projects\rdtools\docs\sphinx\source\generated\rdtools.normalization.sapm_dc_power.rst, C:\Users\KANDERSO\projects\rdtools\docs\sphinx\source\generated\rdtools.normalization.t_step_nanoseconds.rst, C:\Users\KANDERSO\projects\rdtools\docs\sphinx\source\generated\rdtools.normalization.trapz_aggregate.rst, C:\Users\KANDERSO\projects\rdtools\docs\sphinx\source\generated\rdtools.soiling.rst, C:\Users\KANDERSO\projects\rdtools\docs\sphinx\source\generated\rdtools.soiling.soiling_srr.rst, C:\Users\KANDERSO\projects\rdtools\docs\sphinx\source\generated\rdtools.soiling.srr_analysis.rst
building [mo]: targets for 0 po files that are out of date
building [html]: targets for 4 source files that are out of date
updating environment: 33 added, 0 changed, 0 removed
reading sources... [100%] readme_link
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
writing output... [100%] readme_link
generating indices... genindex py-modindex
writing additional pages... search
copying images... [100%] ../build/doctrees/nbsphinx/example_33_2.png
copying static files... done
copying extra files... done
dumping search index in English (code: en) ... done
dumping object inventory... done
build succeeded.

The HTML pages are in build\html.

If you get an error like Pandoc wasn't found, you can install it with conda:

conda install -c conda-forge pandoc

The built documentation should be in rdtools/docs/sphinx/build and opening index.html with a web browser will display it.

Contributing

Community participation is welcome! New contributions should be based on the development branch, as the master branch is used only for releases.

RdTools follows the PEP 8 style guide. We recommend setting up your text editor to automatically highlight style violations because it's easy to miss some issues (trailing whitespace, etc.) otherwise.

Additionally, our documentation is built in part from docstrings in the source code. These docstrings must be in NumpyDoc format to be rendered correctly in the documentation.
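
For reference, a minimal NumpyDoc-style docstring looks like this (the function itself is purely illustrative):

def example_function(energy_normalized, window=7):
    """
    One-line summary of what the function does.

    Parameters
    ----------
    energy_normalized : pandas.Series
        Daily normalized energy time series.
    window : int, default 7
        Length of the rolling window in days.

    Returns
    -------
    pandas.Series
        Smoothed normalized energy.
    """
    return energy_normalized.rolling(window, center=True).median()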

Finally, all code should be tested. Some older tests in RdTools use the unittest module, but new tests should all use pytest.
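
A small pytest-style test might look like this sketch (a hypothetical test, shown only to illustrate the convention; the expected value follows from insolation-weighted averaging):

import pandas as pd
import pytest
import rdtools


def test_aggregation_insol_weights_by_insolation():
    times = pd.date_range('2019-01-01', periods=2, freq='12H', tz='UTC')
    normalized = pd.Series([1.0, 0.5], index=times)
    insolation = pd.Series([100.0, 300.0], index=times)
    daily = rdtools.aggregation_insol(normalized, insolation, frequency='D')
    # insolation-weighted average: (1.0*100 + 0.5*300) / 400 = 0.625
    assert daily.iloc[0] == pytest.approx(0.625)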
