Reading an EPW file starts with the function read_epw(), which parses an EPW file and returns an Epw object. The parsing process basically follows that of [EnergyPlus/WeatherManager.cc] in EnergyPlus, with some simplifications.

Details

An EPW file can be divided into two parts: headers and weather data. The first eight lines of a standard EPW file are normally headers, which contain data on location, design conditions, typical/extreme periods, ground temperatures, holidays/daylight savings, data periods and other comments. The Epw class provides methods to directly extract those data. For details on the data structure of an EPW file, please see "Chapter 2 - Weather Converter Program" in the EnergyPlus "Auxiliary Programs" documentation. An online version is also available.

There are about 35 variables in the core weather data. However, not all of them are used by EnergyPlus. Apart from the date and time columns, only 13 columns are used:

  1. dry bulb temperature

  2. dew point temperature

  3. relative humidity

  4. atmospheric pressure

  5. horizontal infrared radiation intensity from sky

  6. direct normal radiation

  7. diffuse horizontal radiation

  8. wind direction

  9. wind speed

  10. present weather observation

  11. present weather codes

  12. snow depth

  13. liquid precipitation depth

Note that the hour column in the core weather data corresponds to the period from the (Hour-1)th to the (Hour)th hour. For instance, if the number of intervals per hour is 1, an hour of 1 on a certain day corresponds to the period from 00:00:01 to 01:00:00, an hour of 2 corresponds to the period from 01:00:01 to 02:00:00, and so on. Currently, EnergyPlus does not use the minute column to determine the sub-hour time. For instance, if the number of intervals per hour is 2, there is no difference between two sets of rows with the following time columns: (a) Hour 1, Minute 0; Hour 1, Minute 30 and (b) Hour 1, Minute 30; Hour 1, Minute 60. Only the number of rows counts. When EnergyPlus reads the EPW file, both (a) and (b) represent the same time periods: 00:00:00 - 00:30:00 and 00:30:00 - 01:00:00.

Missing data in the weather file can be summarized in the eplusout.err file if DisplayWeatherMissingDataWarnings is turned on in the Output:Diagnostics object. In EnergyPlus, missing data is reported only for fields that EnergyPlus will use. EnergyPlus fills some missing data automatically during simulation. Likewise, out-of-range values are counted for each occurrence and summarized. However, note that out-of-range values are not changed by EnergyPlus and could affect your simulation.
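The hour convention described above can be sketched in base R. This is only an illustration: epw_row_span() is a hypothetical helper, not part of eplusr, and it assumes that rows within an hour are identified by their position rather than by the minute column, as the text states EnergyPlus does.

```r
# Hypothetical helper (not eplusr API): map an EPW `Hour` value and the
# position of a row within that hour to the time span the row covers.
epw_row_span <- function(hour, row_in_hour, interval) {
    stopifnot(hour >= 1, hour <= 24, row_in_hour >= 1, row_in_hour <= interval)
    step <- 60 / interval                                # minutes covered by one row
    start <- (hour - 1) * 60 + (row_in_hour - 1) * step  # minutes since midnight
    end <- start + step
    fmt <- function(m) sprintf("%02d:%02d", as.integer(m %/% 60), as.integer(m %% 60))
    c(start = fmt(start), end = fmt(end))
}

# Two rows per hour: both rows of Hour 1, whatever their Minute values are
first_row <- epw_row_span(hour = 1, row_in_hour = 1, interval = 2)
second_row <- epw_row_span(hour = 1, row_in_hour = 2, interval = 2)
print(first_row)
print(second_row)
```

With interval = 2, the first row of Hour 1 spans 00:00 to 00:30 and the second spans 00:30 to 01:00, regardless of the values stored in the minute column.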

The Epw class provides methods to easily extract and inspect abnormal (missing and out-of-range) weather data, and to show what actions EnergyPlus will perform on those data.

EnergyPlus energy model calibration often uses actual measured weather data. To streamline the error-prone process of creating a custom EPW file, Epw provides methods to directly add and replace the core weather data.

Usage

epw <- read_epw(path)
epw$location(city, state_province, country, data_source, wmo_number, latitude,
             longitude, time_zone, elevation)
epw$design_condition()
epw$typical_extreme_period()
epw$ground_temperature()
epw$holiday(leapyear, dst, holiday)
epw$comment1(comment)
epw$comment2(comment)
epw$num_period()
epw$interval()
epw$period(period, name, start_day_of_week)
epw$missing_code()
epw$initial_missing_value()
epw$range_exist()
epw$range_valid()
epw$fill_action(type = c("missing", "out_of_range"))
epw$data(period = 1L, start_year = NULL, tz = "UTC", update = FALSE)
epw$abnormal_data(period = 1L, cols = NULL, keep_all = TRUE,
                  type = c("both", "missing", "out_of_range"))
epw$redundant_data()
epw$make_na(period = NULL, missing = FALSE, out_of_range = FALSE)
epw$fill_abnormal(period = NULL, missing = FALSE, out_of_range = FALSE, special = FALSE)
epw$add_unit()
epw$drop_unit()
epw$purge()
epw$add(data, realyear = FALSE, name = NULL, start_day_of_week = NULL, after = 0L, warning = TRUE)
epw$set(data, realyear = FALSE, name = NULL, start_day_of_week = NULL, period = 1L, warning = TRUE)
epw$del(period)
epw$clone(deep = TRUE)
epw$is_unsaved()
epw$save(path, overwrite = FALSE, purge = FALSE)
epw$print()
print(epw)

Read

epw <- read_epw(path)

Arguments

  • path: Path of an EnergyPlus EPW file.

Query and Modify Header

LOCATION Header

epw$location(city, state_province, country, data_source, wmo_number, latitude,
             longitude, time_zone, elevation)

$location() takes new values for LOCATION header fields and returns the parsed values of LOCATION header in a list format. If no input is given, current LOCATION header value is returned.

Arguments:

  • city: A string of city name recorded in the LOCATION header.

  • state_province: A string of state or province name recorded in the LOCATION header.

  • country: A string of country name recorded in the LOCATION header.

  • data_source: A string of data source recorded in the LOCATION header.

  • wmo_number: A string of WMO (World Meteorological Organization) number recorded in the LOCATION header.

  • latitude: A number of latitude recorded in the LOCATION header. North latitude is positive and south latitude is negative. Should be in range [-90, +90].

  • longitude: A number of longitude recorded in the LOCATION header. East longitude is positive and west longitude is negative. Should be in range [-180, +180].

  • time_zone: A number of time zone recorded in the LOCATION header. Usually presented as the offset in hours from UTC. Should be in range [-12, +14].

  • elevation: A number of elevation recorded in the LOCATION header. Should be in range [-1000, 9999.9).

DESIGN CONDITION Header

epw$design_condition()

$design_condition() returns the parsed values of DESIGN CONDITION header in a list format with 4 elements:

  • source: A string of source field

  • heating: A list, usually of length 16, of the heating design conditions

  • cooling: A list, usually of length 32, of the cooling design conditions

  • extreme: A list, usually of length 16, of the extreme design conditions

For the meaning of each element, please see ASHRAE Handbook of Fundamentals.

TYPICAL/EXTREME Header

epw$typical_extreme_period()

$typical_extreme_period() returns the parsed values of TYPICAL/EXTREME PERIOD header in a data.table format with 5 columns:

  • index: Integer type. The index of typical or extreme period record

  • name: Character type. The name of typical or extreme period record

  • type: Character type. The type of period. Possible values: typical and extreme

  • start_day: Date type with customized formatting. The start day of the period

  • end_day: Date type with customized formatting. The end day of the period

GROUND TEMPERATURE Header

epw$ground_temperature()

$ground_temperature() returns the parsed values of GROUND TEMPERATURE header in a data.table format with 7 columns:

  • index: Integer type. The index of ground temperature record

  • depth: Numeric type. The depth at which the ground temperature is measured

  • month: Integer type. The month when the ground temperature is measured

  • soil_conductivity: Numeric type. The soil conductivity at measured depth

  • soil_density: Numeric type. The soil density at measured depth

  • soil_specific_heat: Numeric type. The soil specific heat at measured depth

  • temperature: Numeric type. The measured ground temperature

HOLIDAYS/DAYLIGHT SAVINGS Header

epw$holiday(leapyear, dst, holiday)

$holiday() takes new values for the leap year indicator, daylight saving time and holiday specifications, sets these new values and returns the parsed values of HOLIDAYS/DAYLIGHT SAVINGS header. If no input is given, the current values of HOLIDAYS/DAYLIGHT SAVINGS header are returned. It returns a list of 3 elements:

  • leapyear: A single logical value. TRUE means that the weather data contains leap year data

  • dst: A Date vector containing the start and end day of daylight saving time

  • holiday: A data.table containing 2 columns. If no holiday is specified, an empty data.table

    • name: Name of the holiday

    • day: Date of the holiday

The following validation is performed when changing the leapyear indicator:

  • If current record of leapyear is TRUE, but new input is FALSE, the modification is only conducted when none of the data periods cover Feb 29.

  • If current record of leapyear is FALSE, but new input is TRUE, the modification is only conducted when TMY data periods do not cross the end of February, e.g. [01/02, 02/28], [03/01, 12/31]; for AMY data, it is always OK.

The date specifications in dst and holiday should follow the rules of "Table 2.14: Weather File Date Field Interpretation" in the "Auxiliary Programs" documentation. eplusr is able to handle all those formats automatically. Basically, 5 formats are allowed:

  1. A single integer is interpreted as the Julian day of year. For example, 1, 2, 3 and 4 will be parsed and presented as 1st day, 2nd day, 3rd day and 4th day.

  2. A single number is interpreted as Month.Day. For example, 1.2 and 5.6 will be parsed and presented as Jan 02 and May 06.

  3. A string giving MonthName / DayNumber, DayNumber / MonthName, or MonthNumber / DayNumber. A year number can also be included. For example, "Jan/1", "05/Dec", "7/8", "02/10/2019", and "2019/04/05" will be parsed and presented as Jan 01, Dec 05, Jul 08, 2019-02-10 and 2019-04-05.

  4. A string giving number Weekday in Month. For example, "2 Sunday in Jan" will be parsed and presented as 2nd Sunday in January.

  5. A string giving Last Weekday in Month. For example, "last Sunday in Dec" will be parsed and presented as Last Sunday in December.

For convenience, besides all the formats described above, dst and days in holiday also accept standard Date input. They are treated the same way as format No. 3 described above.
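Formats 4 and 5 only become concrete dates once a year is known. The base R sketch below shows one way such a specification could be resolved. weekday_in_month() is a hypothetical helper written for illustration, not the parser eplusr uses, and it takes the weekday as a number (0 = Sunday, ..., 6 = Saturday) instead of parsing the string.

```r
# Hypothetical helper (not eplusr API): resolve "<n> <Weekday> in <Month>" or
# "last <Weekday> in <Month>" to a Date for a given year.
# wday: 0 = Sunday, ..., 6 = Saturday; n: positive integer, or Inf for "last".
weekday_in_month <- function(year, month, wday, n = 1) {
    first <- as.Date(sprintf("%04d-%02d-01", as.integer(year), as.integer(month)))
    days <- seq(first, by = "day", length.out = 31)
    days <- days[as.POSIXlt(days)$mon + 1 == month]  # drop days of the next month
    hits <- days[as.POSIXlt(days)$wday == wday]      # all matching weekdays
    if (is.infinite(n)) hits[length(hits)] else hits[n]
}

second_sunday_jan <- weekday_in_month(2019, 1, wday = 0, n = 2)   # "2 Sunday in Jan"
last_sunday_dec <- weekday_in_month(2019, 12, wday = 0, n = Inf)  # "last Sunday in Dec"
print(second_sunday_jan)
print(last_sunday_dec)
```

The weekday is read from as.POSIXlt()$wday rather than weekdays() so that the result does not depend on the session locale.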

Arguments:

  • leapyear: Either TRUE or FALSE.

  • dst: A length-2 vector of EPW date specifications identifying the start and end of daylight saving time. For example, c(3.10, 10.3).

  • holiday: A list or a data.frame containing two elements (columns) name and day, where name gives the holiday names and day gives valid EPW date specifications. For example:

    list(name = c("New Year's Day", "Christmas Day"), day = c("1.1", "25 Dec"))
    

COMMENT1 and COMMENT2 Header

epw$comment1(comment)
epw$comment2(comment)

$comment1() and $comment2() both take a single string of new comments and replace the old comment with the input one. If no input is given, the current comment is returned.

Arguments:

  • comment: A string of new comments.

DATA PERIODS Header

epw$num_period()
epw$interval()
epw$period(period, name, start_day_of_week)

$num_period() returns a single positive integer giving the number of data periods the current Epw contains.

$interval() returns a single positive integer giving the number of weather data records in one hour.

$period() takes a data period index, a new period name and a start day of week specification, and uses that input to replace the data period's name and start day of week. If no input is given, the data periods in the current Epw are returned.

$period() returns a data.table with 5 columns:

  • index: Integer type. The index of data period.

  • name: Character type. The name of data period.

  • start_day_of_week: Integer type. The start day of week of data period.

  • start_day: Date (EpwDate) type. The start day of data period.

  • end_day: Date (EpwDate) type. The end day of data period.

Arguments:

  • period: A positive integer vector identifying the data period indexes.

  • name: A character vector used as new names for specified data periods. Should have the same length as period.

  • start_day_of_week: A character vector or an integer vector used as the new start days of week of specified data periods. Should have the same length as period.

Weather Data Specifications

epw$missing_code()
epw$initial_missing_value()
epw$range_exist()
epw$range_valid()
epw$fill_action(type = c("missing", "out_of_range"))

$missing_code() returns a list of 29 elements containing the value used as missing value identifier for all weather data.

$initial_missing_value() returns a list of 16 elements containing the initial value used to replace missing values for corresponding weather data.

$range_exist() returns a list of 28 elements containing the range each numeric weather data should fall in. Any values out of this range are treated as missing.

$range_valid() returns a list of 28 elements containing the range each numeric weather data should fall in. Any values out of this range are treated as invalid.

$fill_action() returns a list containing the actions that EnergyPlus, and also $fill_abnormal() in the Epw class, will perform when abnormal data are found for the corresponding weather data. There are 3 types of actions in total:

  • do_nothing: All abnormal values are left as they are.

  • use_zero: All abnormal values are reset to zeros.

  • use_previous: The first abnormal value of a variable is set to its initial missing value. All later abnormal values are set to the previous valid value.

Arguments:

  • type: What abnormal type of actions to return. Should be one of "missing" and "out_of_range". Default: "missing".
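The use_previous action can be illustrated with a short base R sketch. It follows one reading of the description above: the initial missing value is used until a valid value has been seen, after which the most recent valid value is carried forward. fill_use_previous() is a hypothetical helper, not eplusr or EnergyPlus code, and the missing code and initial value in the example are illustrative numbers only.

```r
# Hypothetical helper (not eplusr API): apply a "use_previous"-style fill to
# one numeric column. Values equal to `missing_code` (or NA) count as abnormal.
fill_use_previous <- function(x, missing_code, initial) {
    last_valid <- initial              # fallback until a valid value is seen
    for (i in seq_along(x)) {
        if (is.na(x[i]) || x[i] == missing_code) {
            x[i] <- last_valid         # abnormal: reuse the last valid value
        } else {
            last_valid <- x[i]         # valid: remember it for later rows
        }
    }
    x
}

# Illustrative numbers only: 99.9 marks a missing value, 6.0 is the initial value
filled <- fill_use_previous(c(99.9, 3.2, 99.9, 99.9, 5.1),
                            missing_code = 99.9, initial = 6.0)
print(filled)
```

Here the leading missing value becomes 6.0, and the two missing values after 3.2 both become 3.2.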

Query Weather Data

epw$data(period = 1L, start_year = NULL, tz = "UTC", update = FALSE)
epw$abnormal_data(period = 1L, cols = NULL, keep_all = TRUE,
                  type = c("both", "missing", "out_of_range"))
epw$redundant_data()

$data() returns weather data of specific data period.

Usually, EPW files downloaded from the EnergyPlus website contain TMY weather data. As the years of the weather data are not consecutive, it may be more convenient to align the year values to be consecutive, which makes it possible to directly analyze and plot the weather data. The start_year argument in $data() helps to achieve this. However, arbitrarily setting the year may result in a date time series that does not have the same start day of week as specified in the DATA PERIODS header. eplusr provides a simple solution for this. By setting start_year to NULL and align_wday to TRUE, eplusr will calculate a year value (from the current year backwards) for each data period that complies with the start day of week restriction.

Note that if the current data period contains AMY data and start_year is given, a warning is issued because the actual year values will be overwritten by the input start_year. Also, an error is issued if using the input start_year introduces invalid date times. This may happen when the weather data contains leap year data but the input start_year is not a leap year. An error is also issued if applying the time zone specified using tz introduces invalid date times.
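The backwards year search described above can be sketched in base R. align_year() is a hypothetical helper, not the routine eplusr uses. It takes the starting year as an explicit argument instead of the current year so that the result is reproducible, and it gives up after 28 years.

```r
# Hypothetical helper (not eplusr API): find the latest year not after `from`
# in which the given month/day falls on the requested weekday
# (0 = Sunday, ..., 6 = Saturday). Returns NA if none is found in 28 years.
align_year <- function(month, day, wday, from) {
    for (year in seq(from, from - 27)) {
        d <- as.Date(sprintf("%04d-%02d-%02d", as.integer(year),
                             as.integer(month), as.integer(day)),
                     format = "%Y-%m-%d")  # explicit format: invalid dates give NA
        if (!is.na(d) && as.POSIXlt(d)$wday == wday) return(year)
    }
    NA_integer_
}

# A data period starting on Jan 1 whose start day of week is Sunday
aligned <- align_year(month = 1, day = 1, wday = 0, from = 2019)
print(aligned)
```

Searching backwards from 2019, the first year in which Jan 1 is a Sunday is 2017.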

$abnormal_data() returns abnormal data of specific data period. Basically, there are 2 types of abnormal data in Epw, i.e. missing values and out-of-range values. Sometimes, it may be useful to extract and inspect those data especially when inserting measured weather data. $abnormal_data() does this.

$redundant_data() returns weather data in the Epw object that does not belong to any data period. This data can be removed using the $purge() method described below.

For $abnormal_data() and $redundant_data(), a new column named line is created indicating the line numbers where abnormal data and redundant data occur in the actual EPW file.

Arguments:

  • period: A single positive integer identifying the data period index. Data periods information can be obtained using $period() described above.

  • start_year: A positive integer identifying the year of first date time in specified data period. If NULL, the values in the year column are used as years of datetime column. Default: NULL.

  • align_wday: Only applicable when start_year is NULL. If TRUE, a year value is automatically calculated for the specified data period so that it complies with the start day of week value specified in the DATA PERIODS header.

  • tz: A valid time zone to be assigned to the datetime column. All valid time zone names can be obtained using OlsonNames(). Default: "UTC".

  • update: If TRUE, the year column is updated according to the newly created datetime column using start_year. If FALSE, the original year data in the Epw object is kept. Default: FALSE.

  • cols: A character vector identifying what data columns, i.e. all columns except datetime, year, month, day, hour and minute, to search abnormal values. If NULL, all data columns are used. Default: NULL.

  • keep_all: If TRUE, all columns are returned. If FALSE, only line, datetime, year, month, day, hour and minute, together with columns specified in cols are returned. Default: TRUE.

  • type: What abnormal type of data to return. Should be one of "both", "missing" and "out_of_range". Default: "both".

Modify Weather Data In-Place

epw$make_na(period = NULL, missing = FALSE, out_of_range = FALSE)
epw$fill_abnormal(period = NULL, missing = FALSE, out_of_range = FALSE, special = FALSE)
epw$add_unit()
epw$drop_unit()
epw$purge()

Note that all these 5 methods modify the weather data in-place, meaning that the returned data from $data() and $abnormal_data() may be different after calling these methods.

$make_na() converts specified abnormal data into NAs in specified data period. This makes it easier to find abnormal data directly using is.na() instead of using $missing_code().

$fill_abnormal() fills specified abnormal data using the corresponding actions listed in $fill_action(). For what kinds of actions are performed, please see the $fill_action() method described above. Note that the special actions listed in $fill_action() are performed only if special is TRUE. If special is FALSE, all abnormal data, including both missing values and out-of-range values, are filled with the corresponding missing codes.

$make_na() and $fill_abnormal() are reversible, i.e. $make_na() can be used to counteract the effects introduced by $fill_abnormal(), and vice versa.
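The round trip between missing codes and NAs can be pictured with a plain numeric vector. This base R sketch only mimics the idea behind $make_na(missing = TRUE) followed by $fill_abnormal(missing = TRUE) with special = FALSE. It does not call eplusr, and the missing code 99.9 is an illustrative value.

```r
# Illustrative values only; no Epw object is involved here.
missing_code <- 99.9
dry_bulb <- c(12.5, 99.9, 14.0)

# Like $make_na(missing = TRUE): missing codes become NA
as_na <- replace(dry_bulb, dry_bulb == missing_code, NA)
print(which(is.na(as_na)))   # positions of the missing values

# Like $fill_abnormal(missing = TRUE): NAs go back to the missing code
restored <- replace(as_na, is.na(as_na), missing_code)
print(restored)
```

After the second step the vector is identical to the original, which is the sense in which the two operations undo each other.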

$add_unit() assigns units to numeric weather data using units::set_units() if applicable.

$drop_unit() removes all units of numeric weather data.

Similarly, $add_unit() and $drop_unit() are reversible, i.e. $add_unit() can be used to counteract the effects introduced by $drop_unit(), and vice versa.

$purge() deletes weather data in the Epw object that does not belong to any data period.

Arguments:

  • period: A positive integer vector identifying the data period indexes. Data periods information can be obtained using $period() described above. If NULL, all data periods are included. Default: NULL.

  • missing: If TRUE, missing values are included. Default: FALSE.

  • out_of_range: If TRUE, out-of-range values are included. Default: FALSE.

  • special: If TRUE, abnormal data are filled using the corresponding actions listed in $fill_action(). If FALSE, all abnormal data are filled with the missing codes described in $missing_code().

Set Weather Data

epw$add(data, realyear = FALSE, name = NULL, start_day_of_week = NULL, after = 0L, warning = TRUE)
epw$set(data, realyear = FALSE, name = NULL, start_day_of_week = NULL, period = 1L, warning = TRUE)
epw$del(period)

$add() adds a new data period into current Epw object at specified position.

$set() replaces existing data period using input new weather data.

The validity of input data is checked before adding or setting, according to the following rules:

  • Column datetime exists and has type of POSIXct. Note that time zone of input date time will be reset to UTC.

  • It assumes that input data is already sorted, i.e. no further sorting is made during validation. This is because when input data is TMY data, there is no way to properly sort input data rows only using datetime column.

  • Number of data records per hour should be consistent across input data.

  • Input number of data records per hour should be the same as existing data periods.

  • The date time of input data should not overlap with existing data periods.

  • Input data should have all 29 weather data columns with the right types. The year, month, day, and minute columns are not compulsory. They will be created according to values in the datetime column. Existing values will be overwritten.
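Before passing measured data to $add() or $set(), it can help to check the datetime column yourself. The base R sketch below covers only two of the rules above (POSIXct type and a consistent number of records per hour). records_per_hour() is a hypothetical helper, not the validation eplusr runs, and it assumes a continuous, already sorted series such as AMY data.

```r
# Hypothetical pre-check (not eplusr API): verify that `datetime` is POSIXct
# and evenly spaced, and return the implied number of records per hour.
records_per_hour <- function(datetime) {
    stopifnot(inherits(datetime, "POSIXct"), length(datetime) >= 2L)
    step <- unique(as.numeric(diff(datetime), units = "secs"))  # spacing in seconds
    if (length(step) != 1L || step <= 0 || 3600 %% step != 0)
        stop("records are not evenly spaced within each hour")
    as.integer(3600 / step)
}

# One day of half-hourly records in UTC, first record ending at 00:30
dt <- seq(as.POSIXct("2019-01-01 00:30:00", tz = "UTC"),
          by = "30 min", length.out = 48)
n_per_hour <- records_per_hour(dt)
print(n_per_hour)
```

The returned value can then be compared with the result of $interval() on the existing Epw object, since the rules above require both to match.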

$del(period) removes one specified data period.

Arguments:

  • data: A data.table of new weather data to add or set. Validation is performed according to rules described above.

  • realyear: Whether input data is AMY data. Default: FALSE.

  • name: A new string used as the name of the added or set data period. Should not be the same as existing data period names. If NULL, it is generated automatically in the format Data, Data_1 and so on, based on existing data period names. Default: NULL.

  • start_day_of_week: A single integer or character specifying start day of week of input data period. If NULL, Sunday is used for TMY data and the actual start day of week is used for AMY data. Default: NULL.

  • after: A single integer identifying the index of the data period after which the new data period is inserted. If 0, the new data period will be the first data period. Default: 0.

  • period: A single integer identifying the index of data period to set.

  • warning: If TRUE, warnings are given if any missing data, out-of-range data are found. Default: TRUE.

Save

epw$is_unsaved()
epw$save(path, overwrite = FALSE, purge = FALSE)

$is_unsaved() returns TRUE if there are any modifications to the Epw object since it was saved or since it was created if not saved before.

$save() saves current Epw to an EPW file. Note that if missing values and out-of-range values are converted to NAs using $make_na(), they will be filled with corresponding missing codes during saving.

Arguments

  • path: A path where to save the weather file. If NULL, the path of the weather file itself is used. Default: NULL.

  • overwrite: Whether to overwrite the file if it already exists. Default is FALSE.

  • purge: Whether to remove redundant data when saving. Default: FALSE.

Clone

epw$clone(deep = TRUE)

$clone() copies and returns the cloned Epw object. Because Epw uses R6Class under the hood, which has "modify-in-place" semantics, epw_2 <- epw_1 does not copy epw_1 at all but only creates a new binding to epw_1. Modifying epw_1 will also affect epw_2, as these two are exactly the same thing underneath. In order to create a complete cloned copy, please use $clone(deep = TRUE).

Arguments

  • deep: Has to be TRUE if a complete cloned copy is desired. Default: TRUE.
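The difference between a new binding and a copy can be demonstrated without eplusr. R6 objects are built on environments, so the base R sketch below uses plain environments as stand-ins. The names epw_1, epw_2 and epw_3 are illustrative and are not Epw objects, and $clone(deep = TRUE) does more than this simple copy (for example, it also clones nested R6 fields).

```r
# Plain environments stand in for R6 objects here; no Epw is involved.
epw_1 <- new.env()
epw_1$city <- "Chongqing"

epw_2 <- epw_1                                        # new binding, same object
epw_3 <- list2env(as.list(epw_1), envir = new.env())  # independent copy

epw_1$city <- "Beijing"

print(epw_2$city)   # follows epw_1, because both names point to one object
print(epw_3$city)   # unchanged, because it is a separate copy
```

After the assignment, epw_2$city is "Beijing" while epw_3$city is still "Chongqing".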

Print

epw$print()
print(epw)

$print() prints the Epw object, including location, elevation, data source, WMO station, leap year indicator, interval and data periods.

Examples