of Birmingham Study of the factors affecting road roughness measurement using smartphones

The measurement of road roughness is important for the management of economic road maintenance. Not only is it an indicator of road condition and ride quality, but it also is used to determine road use costs, including travel time, fuel consumption, and vehicle maintenance. Because of the importance of roughness for road asset management decision-making, road agencies spend considerable resources trying to measure road roughness in a repeatable and reproducible manner. However, many road agencies with large road networks are unable to record the condition of the entire network frequently enough to determine road condition adequately and so make informed preventative maintenance decisions. To address this, research has been carried out to develop low-cost smartphone-based technologies fitted inside vehicles to measure road condition. The trial of these systems has met with varying degrees of success. This paper presents an in-depth parametric study carried out using state-of-the-art vehicle dynamics software, informed by a review of the literature, to appreciate how and to what degree various influencing variables might affect roughness measurements using a smartphone fitted to a moving vehicle. These variables included the type and position of the smartphone; the type, speed, mass, dynamic response, suspension system, and tire pressure of the vehicle in which the smartphone is fitted; and the longitudinal road profile. The results of the parametric analysis were used to build multivariate linear regression and machine learning algorithms which predict road roughness from a measure of a vehicle's vertical acceleration, taking into account the predominant influencing variables. The multivariate linear regression equations can be used to predict road roughness with a degree of accuracy similar to that expected from a visual inspection. 
On the other hand, the machine learning algorithms, when suitably trained, were able to estimate reliably the road roughness on an integer-based rating scale at a level of detail which is suitable for strategic road asset management, provided that the vehicle type and speed and the type of smartphone are taken into account. DOI: 10.1061/(ASCE)


Introduction
The importance of maintaining roads at an appropriate standard to encourage economic development, minimize road use costs (i.e., travel time, fuel efficiency, vehicle repair costs, and accidents), provide social benefit, and reduce environmental impacts of transport is well documented (Robinson 2008). However, road agencies worldwide generally have insufficient maintenance budgets to treat their entire road networks, and therefore in order to make best use of scarce resources they are obliged to prioritize maintenance according to the perceived socioeconomic returns (Asphalt Industry Alliance 2018). To achieve this in a rational manner, the condition of the road networks must be assessed periodically using appropriate means at suitable frequencies, accuracies, and levels of detail. Ideally, data collection strategies associated with road surface condition should consider the functional performance of the road (e.g., roughness, surface cracking, and fretting), the structural condition (e.g., deflection and rutting), and measures of the road surface associated with the safety of the running surface (e.g., skid resistance and texture depth) (Paterson and Scullion 1990). Because the cost of data collection is considerable, data collection strategies for strategic road networks often are adopted whereby the functional condition, skid resistance, and texture depth of the network is measured in its entirety on an annual basis to identify problem areas rapidly. The structural condition, which is much more expensive to assess, is measured less frequently (McGhee 2004). The assessment of the condition of secondary, or local, roads is even less frequent, and often structural condition is ignored. On most road networks, road roughness usually is adopted as the measure of functional condition because it can be related readily to road use costs and its measurement can be automated. 
The most accurate automated methods of assessing road roughness use vehicles fitted with lasers to measure the road's longitudinal profile. The obtained road profile then is converted to road roughness using a standard mathematical procedure. Nevertheless, even assessing the roughness of a reasonably sized network can be costly. McGhee (2004) found that the cost of collecting road roughness data in the United States is between $1.4 per km ($2.23 per mile) and $6.2 per km ($10 per mile), depending on the state. For example, in Illinois, which has 224,719 km (139,577 mi) of roads, the cost of data collection is $1.4 million annually. An attractive solution for measuring the functional condition of large road networks economically is to make use of the acceleration sensors which are built into the majority of smartphones to assess road roughness by relating measured vertical acceleration to the road's longitudinal profile. Because smartphone ownership and use are widespread, an approach could be envisioned whereby the assessment of the condition of road networks is facilitated using crowdsourced data.
To investigate the feasibility of using smartphone technology for road roughness assessment, the following methodology was used:
1. The literature was reviewed to assess factors that could influence the measurement of vertical acceleration by a smartphone mounted in a moving vehicle.
2. The required accuracy and measurement frequency of the accelerometers in smartphones were analyzed in relation to road condition data collection standards.
3. A vehicle dynamics package was used to examine the effect of vehicle-related and other external factors on the measurement of vehicle vertical acceleration.
4. Regression and machine learning models were developed which can predict road roughness from the vertical accelerations of vehicles. The data used to develop the models were obtained from running a large number of simulations using the vehicle dynamics package. The suitability of the predictive models was determined by comparing their outputs with the actual road roughness values of the road profiles.

Roughness and Its Measurement
The International Roughness Index (IRI) is the most commonly used measure of road roughness and is computed from a road's longitudinal profile using a mathematical procedure described by Sayers (1995). The mathematical procedure simulates a quarter-car model of a standardized vehicle travelling at a speed of 80 km/h over the road profile (ASTM 2015). IRI is expressed in units of slope (e.g., meters per kilometer) and is calculated from the accumulated suspension motions of the quarter-car model divided by the length of the road assessed. ASTM requires the IRI to be calculated over a minimum distance of 160 m (0.1 mi) (ASTM 2012). IRI ≤ 2.5 m/km is considered to represent roads in very good condition, 2.5 < IRI ≤ 3.5 m/km represents roads in good condition, 3.5 < IRI ≤ 6 m/km represents roads in fair condition, and 6 < IRI ≤ 10 m/km represents roads in poor condition (Archondo-Callao 2008). Commercially available devices for measuring road roughness are categorized into four classes according to their accuracy (Table 1) (Sayers et al. 1986). The most accurate of these devices (i.e., Class 1 and Class 2 devices) determine a road section's IRI by measuring the road section's profile, typically with a laser, and converting the profile to an IRI value using the procedure described by Sayers (1995). The World Bank introduced the concept of information quality levels (IQLs) as a guide to collecting data to inform road management decision-making (Paterson and Scullion 1990). Four IQLs are specified. IQL-1 and IQL-2 are associated with the highest level of accuracy and typically are used for research and project-level analyses, respectively. Information at IQL-3 and IQL-4 is at a coarser level of detail and is recommended for programming and strategic planning (i.e., network-level). 
By analogy with the information presented in Table 1, Class 1 devices could be used to collect IQL-1 information, Class 2 devices could be used to collect IQL-2 to IQL-3 information, and Class 3 devices could be used to collect
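The IRI condition bands quoted above from Archondo-Callao (2008) can be written as a simple lookup. The following Python sketch is illustrative only: the function name is hypothetical, and the label returned for values above 10 m/km is an assumption, because the text does not classify that range.

```python
def iri_condition(iri_m_per_km: float) -> str:
    """Map an IRI value (m/km) to the condition bands quoted in the text."""
    if iri_m_per_km < 0:
        raise ValueError("IRI cannot be negative")
    if iri_m_per_km <= 2.5:
        return "very good"
    if iri_m_per_km <= 3.5:
        return "good"
    if iri_m_per_km <= 6.0:
        return "fair"
    if iri_m_per_km <= 10.0:
        return "poor"
    # Assumption: the source gives no band above 10 m/km.
    return "unclassified"


for value in (1.0, 3.1, 5.0, 7.3):
    print(value, iri_condition(value))
```

The four example values are the IRI values of the road sections used later in the parametric study.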

Use of Smartphones to Determine Road Roughness
Modern smartphones have in-built three-axis accelerometers and GPS capabilities. The measurement of road roughness utilizes the accelerometer to measure vertical vehicle-body acceleration. Discrete defects, such as potholes, can be recognized by identifying relatively large perturbations in continuous acceleration recordings. Road roughness can be determined by transforming the continuous vertical acceleration recordings and matching these with the GPS-determined locations. A variety of smartphone-based applications for measuring road condition have been developed using an empirical approach. Table 2 summarizes nine commonly available applications. Despite the availability and use of approaches such as those listed in Table 2, their accuracy and the degree to which they can measure roughness and identify other defects, such as potholes, in real time over large road networks are unclear. Furthermore, because the models are empirical, they may be valid only for a narrow range of vehicle classes. Moreover, empirical studies trialing such systems suggested that they may be regarded as Class 2 or 3 devices (Scholotjes, Visser and Bennett 2014; Chugh et al. 2014). This is partly because a large number of factors need to be taken into account when determining road roughness from acceleration readings obtained from a smartphone. These factors are associated with the smartphones themselves, vehicle-related aspects, and the form of the road surface profile.

Characteristics of Smartphones
Various studies have shown that a number of characteristics of a smartphone may affect the measurement of vertical acceleration and therefore road roughness (Kos et al. 2016; Belzowski and Ekstrom 2015; Feng et al. 2015; Jones and Forslof 2014). The characteristics are associated with the operating range, resolution, frequency, and sensitivity to gravity and temperature. Accelerometers in smartphones were designed for gaming and typically have an operating range of ±2g and a resolution on the order of 0.002 m/s^2 (2 × 10^-4 g) (Del Rosario et al. 2015). Theoretically, these capabilities exceed the ±1g required to measure the road longitudinal profile from a moving vehicle because road vehicle body accelerations normally are between 0.07 and 0.7g (Katu et al. 2003). The high resolution of the accelerometers in a smartphone potentially allows the detection of small changes in measured vertical acceleration. However, in some smartphone models, issues with the acceleration sensors may cause random errors due to manufacturing problems such as the misalignment of the sensor (causing a bias error in measuring acceleration) and temperature-related sensitivity (Woodman 2007). In addition to ambient temperature, the number of processes running on a smartphone can increase a smartphone's internal temperature and in turn lead to errors ranging from 1 to 2.4 mg/°C when converting the physical accelerating force to an electric charge via the smartphone's microelectromechanical system (Kionix 2015). Kos et al. (2016) quantified the sensitivity to gravity of some common smartphone types. They found that for older smartphones (e.g., iPhone 4 and LG Nexus 5) the error in measuring gravity was between 0.0164 and 0.025g, but that more modern smartphones, such as the iPhone 6,

Table 2. Smartphone-based applications for measuring road condition
Application (source) | Feature measured | Description
(not given) | Potholes | Web-based crowd-sourcing to enable verifications by different drivers. A spike in acceleration exceeding a threshold is reported as a pothole.
Byrne et al. (2013) | Potholes | Vehicle vertical accelerations are processed using a band pass filter of 0.5-6 Hz to identify and quantify the defects in terms of major and minor. The effects of low vehicle speeds (5 km/h), cornering, and accelerating/decelerating are eliminated.
RoadLab by World Bank (Wang and Guo 2016) | Road roughness | Uses a regression model to determine IRI from vertical accelerations. Takes into account the effects of the position of the smartphone, vehicle speed, and suspension type. Was found not to be an accurate predictor of the condition of unpaved roads (Workman et al. 2016).
Roadroid (Forslöf 2012) | Road roughness | A multiple linear regression model is used to determine IRI from the RMS of measured vertical accelerations. The model was developed using three vehicle types travelling at speeds between 20 and 100 km/h; different smartphones were considered. The accuracy of estimated IRI from the correlation was found to be 70%-80% compared with Class 1, considered as IQL-3/4. Measuring frequency of 100 Hz.
Bump recorder (Koichi 2014) | Discrete defects and road roughness | The prototype application was developed using vertical acceleration data obtained from a smartphone fitted in a Toyota Prius. The application claims to determine both roughness and the height of discrete defects. The estimated vehicle unsprung elevation is assumed equal to the road profile. Measuring frequency of 100 Hz.
Islam et al. (2014) | Road roughness | Uses a linear regression model to relate IRI to vertical acceleration. The model was developed from data captured from a smartphone inside a Honda CRV travelling at 50 mph. The system was found to have a high repeatability (coefficient of variation less than 15%). Vehicle sprung mass and suspension were found to affect the accuracy.
Douangphachanh and Oneyama (2013) | Road roughness | An empirical model which relates IRI to vertical acceleration. The data to develop the model were obtained from studies undertaken with two types of smartphones fitted inside a Toyota VIGO 4WD pickup truck and a Toyota Camry. The speed of the vehicles was varied, as was the position of the smartphones inside the vehicles. Measurements were conducted at smartphone capture frequencies of 100 Hz.
Du et al. (2014) | Road roughness | A multilinear regression model was developed between IRI and the power spectral density (PSD) of measured acceleration data. A Lexus sedan was used, travelling at speeds of up to 60 km/h. The estimated and actual IRI were compared and the error was found to be less than 15%.
Belzowski and Ekstrom (2015) | Road roughness | Multiregression models based on empirical data obtained from nine different smartphones at a sampling frequency of 100 Hz. Roads of five different IRI values were considered. Differences were found between smartphones.
had improved performance, with bias errors of less than 0.003g. The maximum speed at which a vehicle fitted with a smartphone can travel for the data collection system to comply with the requirements of the classes of devices specified in ASTM E950 (ASTM 2009) is governed by the sampling frequency (Table 3). This typically is between 40 and 400 Hz (ASTM 2009). Table 3 lists the corresponding maximum vehicle speeds for these frequency ranges.
The sampling frequency range of a smartphone when it is used to assess roughness should be limited to between 80 and 120 Hz (Jones and Forslof 2014). This is because at sampling frequencies lower than 80 Hz, the vehicle needs to travel at relatively slow and constant speeds to be in accordance with the requirements of Class 3 and better devices (Tables 2 and 3). At higher sampling frequencies, data storage and real-time processing can become problematic. Most of the applications identified in Table 2 were developed for a sampling frequency of 100 Hz. This suggests that in slow-moving traffic such as in urban environments where speed limits are less than 50 km/h, a smartphone used for roughness measurement would satisfy the requirements of Class 2/3 devices according to the ASTM E950 specifications (Table 3). This indicates that a smartphone should collect information at a sampling interval of not greater than 300 mm along the travelled distance.
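The link between vehicle speed, sampling frequency, and the distance between consecutive readings follows from simple kinematics: the spacing is the speed divided by the sampling frequency. The Python sketch below works this through for the 100 Hz and 50 km/h case discussed above; the function names are hypothetical, and the 300 mm limit is the only threshold taken from the text.

```python
def sampling_interval_mm(speed_kmh: float, frequency_hz: float) -> float:
    """Distance travelled between consecutive samples, in millimetres."""
    if frequency_hz <= 0:
        raise ValueError("sampling frequency must be positive")
    speed_m_per_s = speed_kmh / 3.6
    return speed_m_per_s / frequency_hz * 1000.0


def max_speed_kmh(max_interval_mm: float, frequency_hz: float) -> float:
    """Highest speed at which the sample spacing stays within max_interval_mm."""
    return max_interval_mm / 1000.0 * frequency_hz * 3.6


# 50 km/h at 100 Hz gives a spacing of about 139 mm, well inside 300 mm.
print(round(sampling_interval_mm(50.0, 100.0), 1))
# At 100 Hz the 300 mm limit is reached at about 108 km/h.
print(round(max_speed_kmh(300.0, 100.0), 1))
```

At the lower bound of 80 Hz the same calculation gives a limiting speed of about 86 km/h for a 300 mm spacing.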

Smartphone Mounting System and Position
A number of studies have shown that the mounting system and position of the smartphone in the vehicle can affect the recorded vertical acceleration values by ±15% (Kropáč and Múčka 2005; Belzowski and Ekstrom 2015). To avoid these issues, the smartphone should be mounted on the windshield with a suction cup and with a bracket that attaches to the dashboard to ensure a rigid mounting.
To quantify how the measurement of road roughness is influenced by these factors, a suite of numerical simulations was undertaken using the vehicle dynamics package CarSim version 2018.12. CarSim is an industry-standard vehicle dynamics simulator which is used extensively by vehicle manufacturers and researchers (Mechanical Simulation 2017; So et al. 2014). CarSim is designed to replicate the real-world behavior of actual vehicle classes, and independent research has demonstrated that the vehicle responses simulated by CarSim closely replicate the actual responses of a variety of such vehicles, for example, the suspension and steering kinematics (Kinjawadekar et al. 2009) and vertical body acceleration (Organiscak 2014; Varunjikar et al. 2012). The simulations in the present work consisted of determining vehicle body vertical accelerations for a variety of combinations of vehicle types, speeds, accelerations, suspension stiffnesses, masses and damping ratios, tire pressures, and road profiles. For all simulations, the effect of the driver was included by adding 70 kg to the vehicle's sprung mass. Road profiles of 160-m straight sections of road with IRI values of between 1.0 and 7.3 m/km were used in the simulation, representing very good to poor road roughness conditions. The modeled profiles were those of actual roads, and were obtained from the Federal Highway Administration's (FHWA) Long-Term Pavement Performance (LTPP) data set (FHWA 2018). The IRI values of the road sections were computed directly from the road profiles using the standard procedure provided by Sayers (1995).
Research has shown that the power spectral density (PSD) of a vehicle's vertical acceleration, obtained from frequency domain analysis, correlates closely to road roughness (Hesami and McManus 2009; Sun 2001; Marcondes et al. 1992). According to Parseval's theorem, the RMS of the vehicle body vertical acceleration (Grms) in the time domain is equal to the square root of the integral of the PSD (Van Baren 2012; Rogers et al. 1997). Therefore, the simulated Grms values of vertical vehicle body accelerations obtained from CarSim were used to compare the results of the simulation with the IRI values of the road profiles. The value of Grms is calculated as the root mean square of the acceleration readings (Jang et al. 2016; Dawkins et al. 2011), where a_z,i = ith vehicle body vertical acceleration (g); N = number of acceleration readings over the measured road section; and g = acceleration due to gravity (i.e., 9.81 m/s^2).
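The equivalence stated by Parseval's theorem can be checked numerically. The Python sketch below computes Grms directly in the time domain and again from a one-sided periodogram PSD, and the two agree to floating-point precision. It is a minimal illustration under stated assumptions: the input is taken to be in m/s^2 and is divided by g before the RMS is formed, the test signal is synthetic, and the function names are hypothetical.

```python
import numpy as np

G = 9.81  # acceleration due to gravity, m/s^2


def grms(accel_ms2) -> float:
    """RMS of vertical acceleration, expressed in g (time domain)."""
    a = np.asarray(accel_ms2, dtype=float) / G
    return float(np.sqrt(np.mean(a ** 2)))


def grms_from_psd(accel_ms2, sample_rate_hz: float) -> float:
    """Same quantity from the integral of a one-sided periodogram PSD."""
    a = np.asarray(accel_ms2, dtype=float) / G
    n = a.size
    spectrum = np.fft.rfft(a)
    psd = np.abs(spectrum) ** 2 / (sample_rate_hz * n)  # g^2/Hz
    # Fold negative frequencies in: double every bin except DC
    # and, for even n, the Nyquist bin.
    if n % 2 == 0:
        psd[1:-1] *= 2.0
    else:
        psd[1:] *= 2.0
    df = sample_rate_hz / n  # frequency resolution, Hz
    return float(np.sqrt(np.sum(psd) * df))


# Synthetic signal (assumption): two sine components sampled at 100 Hz.
fs = 100.0
t = np.arange(1000) / fs
signal = 0.3 * G * np.sin(2 * np.pi * 1.5 * t) + 0.1 * G * np.sin(2 * np.pi * 12.0 * t)
print(grms(signal), grms_from_psd(signal, fs))
```

The periodogram is used here only because it makes the identity exact; an averaged PSD estimate would match the time-domain value only approximately.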

Effect of Vehicle Type
The influence of vehicle type on the simulated Grms was assessed by carrying out simulations using 15 different vehicle types available in CarSim (Table 4). Comparisons were made for vehicles travelling at a constant speed of 50 km/h on four 160-m road sections of very good, good, fair, and poor condition (i.e., IRI values of 1.0, 3.1, 5.0, and 7.3 m/km, respectively). Fig. 2 plots the simulated Grms as a function of road condition and vehicle type. For all vehicle types, the simulated Grms increased with worsening road condition, and the difference in simulated Grms between vehicle types also increased with worsening road condition. Unsurprisingly, the large European van (LEV), European van (EV), and utility truck (UT) had the highest Grms responses. These responses were between 40% and 190% greater than those for passenger vehicles, depending on road condition. This reflects the relatively stiff suspension and heavy unsprung masses of vans and utility trucks commensurate with their requirement to carry cargo. Because the differences in simulated Grms for vans and trucks were so much greater than for all other vehicle types, it would be better to exclude these vehicle types from a system which considers data from a fleet of many different classes of vehicles, particularly on roads of relatively poor condition (IRI > 5 m/km). Alternatively, a fleet consisting of only these vehicle types could be used. The lowest Grms responses, for all four roads, occurred for D-class minivans (DMVs) and F-class sedans (FSDs). This is because they are relatively heavy vehicles and the suspension of a FSD, a luxury car, is relatively soft. Some vehicle types are relatively alike in terms of simulated Grms (i.e., within ±10%) and therefore will provide similar IRI values for the same road sections. These include hatchbacks and D-class sedans (DSDs), D and E classes of sport utility vehicles (SUVs), and the two types of European vans. 
This is because these groups are of a comparable shape, chassis configuration, and weight, and therefore have similar wheel-hop natural frequencies (González et al. 2008). The acceleration data provided by a fleet consisting of such similar vehicle types fitted with comparable smartphones and travelling at similar speeds therefore could be used to provide estimates of roughness at IQL-3 and IQL-4 without needing to make adjustments for vehicle type.

Effect of Suspension and Tire Pressure
An examination was carried out to determine the effects of suspension stiffness, suspension damping, and tire pressure by comparing the responses of a DSD and a LEV. The vehicles were simulated to travel at a constant speed of 50 km/h on four 160-m road sections representative of good, fair, poor, and very poor conditions. For the analysis of suspension stiffness, the suspension spring rates of the vehicles' front and rear springs were varied by ±20%. The results suggest that a decrease in suspension stiffness (i.e., improving ride comfort) reduced the simulated Grms values for both vehicle types (by approximately 0%-7% for the DSD and 7%-13% for the LEV), and an increase in stiffness increased the simulated Grms values for both vehicle types (1%-7% for the DSD and 8%-15% for the LEV), indicating reduced ride quality (Table 5). The percentage changes were found to be greatest for the roads in the worst condition. The effects of suspension stiffness on a LEV were found to be generally higher than those on a DSD because the relatively stiffer suspension of a LEV results in a higher simulated Grms. For the analysis of suspension damping, the suspension damping ratio (shock force versus compression rate) was varied by ±20% on both the front and rear dampers. A decrease in damping ratio reduced the simulated Grms values for both vehicle types considered, and the effect increased as road condition worsened (Table 5). For a 20% reduction in damping, the Grms decreased by 2%-7% for a DSD and by 3%-8% for a LEV. However, a 20% increase in damping ratio increased the simulated Grms by 3%-7% for a DSD and 0%-8% for a LEV. The effect increased as road condition worsened.
The influence of underinflation of tire pressure was simulated by changing the tire spring rate to 70% and 90% of its default value. The results of the analysis demonstrate that reducing the tire spring rate led to a reduction in the simulated Grms of between 2% and 12% for a DSD and between 1% and 20% for a LEV, depending on the degree of change in tire spring rate and road condition (Table 5). The highest reductions in Grms were associated with a lower tire spring rate and a worse road condition.

Effect of Sprung Mass
The addition of weight (e.g., a passenger) was simulated by changing the vehicle sprung mass. For the comparison, five vehicle types were modeled travelling at a speed of 50 km/h on a road in good condition (i.e., IRI = 3.1 m/km). The results show that adding weight resulted in a reduction of the simulated Grms value (Fig. 3). Every additional 70 kg (i.e., approximately one additional passenger) resulted in approximately a 5% decrease in Grms irrespective of the vehicle type or road condition.

Driving Style
The influence of driving style on simulated vehicle body acceleration was investigated by simulating a DSD travelling on a road with IRI = 5 m/km (i.e., an older road in fair condition) under five different driving regimes (Table 6). Compared with driving at a constant speed, accelerating or decelerating changed the simulated Grms by between 3% and 7%.

Road Profile
Road longitudinal profiles of different shapes can have the same roughness value. Fig. 4 shows an example in which two (real) very different road profiles of similar IRI values (i.e., 5.0 m/km, determined over a length of 160 m) have different PSD curves (Fig. 5). The frequency domain plot shows that Section A had lower amplitudes for wavelengths smaller than 0.5 m/cycle (e.g., potholes), whereas Section B had higher amplitudes in the vicinity of 20-m/cycle wavelengths (e.g., due to deformations or displacements which occur in the subgrade) (Fig. 5). To further investigate how this may affect the measurement of IRI using a smartphone, simulations were carried out using three vehicle types travelling at 30 and 50 km/h on four different road sections 160 m in length. Two of the road sections had an IRI of 5 m/km (A and B) and two had an IRI of 1.5 m/km (C and D). The results of the simulation are given in Table 7. For all simulations, the Grms values were different for roads with the same IRI value. The differences were greatest (approximately 25%) for road sections with low IRI values (i.e., 1.5 m/km). This suggests that it may be appropriate to introduce a bandpass filter to eliminate the effects of short wavelength features (e.g., cracks, microtexture, and potholes), especially when assessing roads with low IRI values. Table 8 classifies the influence of the aforementioned factors into three classes according to their degree of influence determined from the parametric study (i.e., high > 30%, moderate = 10%-30%, and low < 10%).
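If a bandpass filter of the kind suggested above is applied to the time-domain acceleration signal, its cut-off frequencies depend on vehicle speed, because a profile feature of wavelength λ excites the vehicle at frequency f = v/λ. The Python sketch below converts the two wavelengths mentioned above at the two simulated speeds. The function name is hypothetical, and the code does not implement the filter itself.

```python
def excitation_frequency_hz(speed_kmh: float, wavelength_m: float) -> float:
    """Temporal frequency (Hz) at which a profile wavelength excites a vehicle."""
    if wavelength_m <= 0:
        raise ValueError("wavelength must be positive")
    return (speed_kmh / 3.6) / wavelength_m


for speed in (30.0, 50.0):
    for wavelength in (0.5, 20.0):
        freq = excitation_frequency_hz(speed, wavelength)
        print(f"{speed:.0f} km/h, {wavelength} m/cycle -> {freq:.2f} Hz")
```

At 50 km/h a 0.5-m wavelength corresponds to about 27.8 Hz, which is below the 50 Hz Nyquist limit of a 100 Hz recording, so such features can be represented in the sampled signal and filtered if required.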

Development of Predictive Models
Mathematical models based on a multivariate linear regression analysis and machine learning were developed to predict road roughness from vehicle body vertical acceleration. The purpose of developing the models was to identify the most suitable model forms and the influencing variables which need to be included in the development of similar predictive models, and to gauge the accuracy which might be expected from the models. The models were developed from a data set produced from 6,000 CarSim simulations of two vehicle types travelling over road sections with different road profiles. The accuracies of the developed models were determined by comparing the IRI values predicted by the models with the IRI values (i.e., target variable values) of the road profiles determined using the industry-standard approach used to determine IRI from data collected by devices which measure the road profile (Sayers 1995). For the simulation, a large European van and a D-class sedan were modeled and subjected to the variables which were found to have a high or moderate impact on vertical vehicle body acceleration (Table 8). The variables considered in the simulation are given in Table 9. The data obtained from the CarSim simulation were split into three separate data sets: one associated with the large European van only, one associated with the D-class sedan, and a third which combined the data from both data sets. The vertical acceleration data were converted to Grms values as discussed previously. Table 10 gives the results of the multivariate regression analysis; the data therein were used to develop a linear model for a large European van (Eq. 2), a linear model for a D-class sedan (Eq. 3), and a linear model which does not include the vehicle type (Eq. 4), where SE = standard error. The coefficient of multiple determination (R^2) of the three linear models was 0.83 for the LEV model, 0.79 for the DSD model, and 0.74 for the model which does not take into account vehicle type. 
Compared with the target variable values of IRI determined from the road profile using the standard method given by Sayers (1995), the standard error of the three linear models was 1.39, 1.54, and 1.69 m/km, respectively. Commonality analysis was carried out to determine how much variance in predicted IRI each of the six variables uniquely contributes (i.e., the significance of the variable). The results show that Grms contributes 82% of the variance in IRI in the case of the LEV and 79% for the DSD (Table 10). Pearson's correlation coefficient also indicated a strong correlation between Grms and the target IRI values (0.81 and 0.83, respectively). Thus, as is self-evident, the simulated Grms is the dominant factor influencing the prediction of IRI from the simulated data sets. Speed was found to have the second largest influence on the prediction of IRI, accounting for 14% of the variance in the case of the DSD and 10% for the LEV. The other influencing variables do not appear to significantly affect the prediction of IRI when using multivariate linear regression. Accordingly, simplified regression models which predict IRI from Grms, speed, and vehicle type were developed as follows:
For a large European van
For a D-class sedan
Ignoring vehicle type
IRI = 49.30 × Grms − 0.055 × Speed + 3.73
The values of the R^2 and SE of the preceding sets of equations which consider all the influencing variables, and that which considers speed alone, suggest that multivariate linear regression could be used to determine IRI to a similar degree of accuracy as might be expected from a visual inspection (Table 1), i.e., IQL-4, even without taking into account the class of vehicle. However, considering all of the influencing variables does not seem to improve the results to an extent which would allow roughness to be measured to a higher IQL.
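The simplified model that ignores vehicle type can be evaluated directly. The Python sketch below encodes only the coefficients quoted above; the function name is hypothetical, and the units (Grms in g and speed in km/h) and the example input values are assumptions made for illustration rather than figures from the study.

```python
def predict_iri_ignoring_vehicle(grms_g: float, speed_kmh: float) -> float:
    """IRI (m/km) from the simplified regression that ignores vehicle type.

    Assumed units: grms_g in g, speed_kmh in km/h.
    """
    return 49.30 * grms_g - 0.055 * speed_kmh + 3.73


# Hypothetical reading: Grms of 0.05 g recorded at 50 km/h.
print(round(predict_iri_ignoring_vehicle(0.05, 50.0), 3))
```

The negative speed coefficient means that, for the same Grms, a higher travelling speed yields a lower predicted IRI.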
Three machine learning algorithms based on a decision regression tree, an artificial neural network (ANN), and a random forest were developed. The three machine learning algorithms are described by Han et al. (2011). The three model types were chosen because their structures (i.e., dendritic and neuron based) have been shown to be applicable to large and complex data sets, yet they can be trained quickly using a relatively small sample of test data (without overfitting) to produce accurate predictions (Chandra et al. 2012; Melhem and Cheng 2003). Furthermore, the model types have been used successfully in related applications which predict road condition from field measurements (Jang et al. 2016; Seyfi et al. 2013; Nitsche et al. 2012; Soleimani and Sahebi 2012).
A classification and regression tree (CART) decision regression tree type, as advocated by Breiman et al. (1984), was selected with an unlimited tree depth and no pruning. The ANN utilized the RProp algorithm described by Riedmiller and Braun (1993) and consisted of two hidden layers with four neurons per layer. For each analysis the RProp algorithm was iterated 500 times. The random forest algorithm was of the form suggested by Breiman et al. (1984) and had two hundred trees with an unlimited tree depth.
The values chosen for the aforementioned parameters (number and depth of trees, number of layers, and so forth) for the three algorithms were determined on a trial-and-error basis to achieve a reasonable balance between model accuracy and the length of the required training time.
A 10-fold cross-validation process was used to train and test the machine learning algorithms. Such a validation process strongly increases the randomness (i.e., variance) in the data modeling and thereby reduces the likelihood of the model achieving a good accuracy by chance (Han et al. 2011). The validation process consisted of partitioning the LEV, DSD, and combined data sets randomly into 10 equal-sized subsets. Each machine learning algorithm was trained using nine subsets, and the remaining subset was used for testing. This process was repeated a further nine times until all subsets were tested. The results of the 10 different trials were averaged to give an indication of the overall performance of each algorithm. Table 11 presents the results of this analysis for the three data sets. The three machine learning algorithms performed better than the regression approach described previously, as evidenced by higher R^2 and lower RMS error (RMSE) values. Furthermore, the neural network and random forest performed better for all three data sets than did the decision tree.
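The partitioning and averaging steps of the 10-fold procedure can be sketched as follows. This Python example is an illustration only: it substitutes an ordinary least-squares linear model for the machine learning algorithms used in the study, and it runs on synthetic data generated for the purpose. All names, the sample size, and the noise level are assumptions.

```python
import numpy as np


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0):
    """Randomly partition sample indices into k near-equal subsets."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), k)


def cross_validated_rmse(X, y, k: int = 10, seed: int = 0) -> float:
    """Train on k-1 subsets, test on the held-out one, average the RMSE."""
    folds = k_fold_indices(len(y), k, seed)
    rmses = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        # Stand-in model: linear least squares with an intercept column.
        A_train = np.column_stack([X[train_idx], np.ones(train_idx.size)])
        coef, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)
        A_test = np.column_stack([X[test_idx], np.ones(test_idx.size)])
        residual = A_test @ coef - y[test_idx]
        rmses.append(np.sqrt(np.mean(residual ** 2)))
    return float(np.mean(rmses))


# Synthetic data (assumption): Grms in g, speed in km/h, noisy linear target.
rng = np.random.default_rng(42)
n = 600
grms_values = rng.uniform(0.02, 0.15, n)
speeds = rng.uniform(20.0, 100.0, n)
X = np.column_stack([grms_values, speeds])
y = 49.30 * grms_values - 0.055 * speeds + 3.73 + rng.normal(0.0, 0.5, n)

mean_rmse = cross_validated_rmse(X, y)
print(round(mean_rmse, 2))
```

Because every sample appears in exactly one test subset, the averaged RMSE reflects performance on data the model did not see during training.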
A feature elimination approach was used to remove each variable in turn in order to better understand the influence of the six selected variables on the prediction of IRI. The results of this process also are given in Table 11. The simulated Grms, as expected, was found to have the greatest influence on predicting IRI. Speed was found to be the second most important variable for all three data sets. For example, when speed was not taken into account, for the LEV data set the RMSE increased from 0.91 to 1.74 m/km and the R² value decreased from 0.92 to 0.72. In comparison, the four other variables had a relatively minor influence on the prediction of IRI. This supports the findings of the regression analysis described previously.
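The feature elimination idea, in which each variable is dropped in turn and the change in error is recorded, can be sketched as below. This is an illustration under stated assumptions: a simple k-nearest-neighbor regressor stands in for the models used in the study, only three hypothetical variables are included, all variables are assumed to be scaled to the range 0 to 1, and the synthetic relationship between them and IRI is invented for the example.

```python
import math
import random

FEATURES = ["grms", "speed", "mass"]  # hypothetical variables, scaled to [0, 1]


def make_data(n, rng):
    """Synthetic rows: IRI depends strongly on grms, less on speed, barely on mass."""
    rows = []
    for _ in range(n):
        x = {f: rng.random() for f in FEATURES}
        iri = (2.0 + 10.0 * x["grms"] - 4.0 * x["speed"]
               + 0.3 * x["mass"] + rng.gauss(0.0, 0.2))
        rows.append((x, iri))
    return rows


def knn_predict(train, x, features, k=5):
    """Average the targets of the k training rows closest to x."""
    nearest = sorted(
        train,
        key=lambda row: sum((row[0][f] - x[f]) ** 2 for f in features),
    )[:k]
    return sum(y for _, y in nearest) / k


def rmse(train, test, features):
    """Root-mean-square prediction error on the test rows."""
    errs = [(y - knn_predict(train, x, features)) ** 2 for x, y in test]
    return math.sqrt(sum(errs) / len(errs))


rng = random.Random(42)
data = make_data(300, rng)
train, test = data[:240], data[240:]

# Error with every variable available, then with each one removed in turn.
baseline = rmse(train, test, FEATURES)
rmse_without = {
    dropped: rmse(train, test, [f for f in FEATURES if f != dropped])
    for dropped in FEATURES
}
```

A large rise in RMSE when a variable is removed indicates that the model relies on it; a negligible change indicates that the variable contributes little to the prediction.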
Comparing the performance of all three machine learning algorithms in analyzing the three data sets indicated that the ANN and random forest algorithms were better predictors of IRI than was the decision regression tree. Both algorithms performed better for the LEV and DSD data sets than for the combined data set. Removing the vehicle type variable from the analysis using the ANN and random forest algorithms increased the RMSE by between 3% and 29%. For both algorithms, the RMSE for the LEV data set was higher than for the DSD data set. This suggests that it is important to take into account vehicle type when predicting IRI. However, removing all variables that were found from the parametric study to have a medium influence on IRI prediction, i.e., number of people, stiffness, damper force, and tire pressure (Table 11), did not affect the error in the prediction. For example, for the neural network model when considering the LEV and combined data sets, the RMSE value increased by 27% (0.91-1.16 m/km) and by 17% (0.97-0.94), respectively.

Concluding Discussion
The suitability of using a smartphone suitably fixed inside a moving vehicle to determine road roughness (IRI) was assessed by reviewing existing approaches and by means of a parametric study carried out with the aid of CarSim, a vehicle dynamics package. CarSim has been shown to accurately replicate actual vehicle behavior, and its use allowed a large number of controlled simulations to be carried out in a relatively short time.
It was found that an important aspect of measuring road roughness with a smartphone is to convert the smartphone's vertical acceleration data to RMS body accelerations (i.e., Grms). The simulation results showed that Grms is strongly correlated with the actual road IRI. In a real-world system, calculating the Grms of potentially millions of measured acceleration signals accruing from road users would require less computational processing than approaches that require the additional transformation of the data to the frequency domain.
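The time-domain calculation referred to above can be sketched in a few lines. This is a minimal illustration, not the processing chain used in the study: the removal of the mean (to discard the constant gravity offset), the sampling rate, and the test signal are all assumptions made for the example, and the function name `grms` is hypothetical.

```python
import math


def grms(accel_g, remove_mean=True):
    """Root-mean-square of vertical acceleration samples expressed in g.

    remove_mean subtracts the average first, which discards a constant
    offset such as gravity (an assumption of this sketch).
    """
    if not accel_g:
        raise ValueError("at least one sample is required")
    offset = sum(accel_g) / len(accel_g) if remove_mean else 0.0
    return math.sqrt(sum((a - offset) ** 2 for a in accel_g) / len(accel_g))


# Assumed test signal: 1 s at 100 Hz, a 5 Hz vibration of amplitude 0.2 g
# superimposed on a constant 1 g offset.
fs = 100
signal = [1.0 + 0.2 * math.sin(2.0 * math.pi * 5.0 * n / fs) for n in range(fs)]
value = grms(signal)
```

The calculation is a single pass over the samples with no Fourier transform, which is why it scales readily to large numbers of recorded signals.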
The parametric study showed that vehicle speed and type and Grms (i.e., vertical vehicle body acceleration) had a large effect (>30%) on the assessment of IRI. Sprung mass, suspension stiffness, longitudinal profiles giving the same IRI, and tire pressure had a moderate effect (i.e., between 10% and 30%), whereas acceleration/braking and suspension damping had a relatively minor influence (<10%). Multivariate linear regression and machine learning algorithms were developed to predict IRI from the vertical acceleration (i.e., Grms) values obtained from the CarSim simulation. The results suggest that a multivariate linear regression which takes into account vehicle speed could be used to predict IRI to an accuracy equivalent to that which might be expected by visual observation (i.e., a Class 4 device at IQL-4). However, using a machine learning algorithm as part of a suitable smartphone-based system could enable road roughness to be predicted in accordance with the requirements of a Class 3 device and to an IQL of 3 (Table 1). For example, the ANN that was trialed was able to predict road roughness to a RMSE of between 0.73 and 0.91 m/km when vehicle class and speed were considered. Data at IQL-3, and at the frequency of collection possible with a smartphone, would enable long-term strategic road management decision-making. Such decision-making enabled by roughness data collected using such a smartphone system would allow for (Sayers et al. 1986):
• The summarizing at low cost of the condition of the entire road network on a regular basis (e.g., annually).
• The use of network level models that evaluate and compare maintenance policies, and assess road use and road agency costs. For example, IRI values at IQL-3 could be utilized within a decision support tool, such as the World Bank's standard for road investment appraisal, HDM-4, to enable economic road maintenance strategies to be identified (Odoki et al. 2013).
• The primary screening of road sections to identify and prioritize road sections requiring maintenance and rehabilitation. In practice, road sections identified for treatment would be further assessed using project-level data (i.e., data to IQL-2) to determine the type of treatment required.
To support such strategic road asset management, the routine inspection of the condition of a road network could be achieved using low-cost data collection systems which utilize smartphones with similar characteristics inside a fleet of vehicles of similar types, travelling at normal traffic speeds.
Without knowing the vehicle type and speed, vertical acceleration data obtained from smartphones could be analyzed using machine learning algorithms to enable IRI to be predicted to a similar accuracy as would be expected from a visual inspection, but with arguably improved repeatability and reproducibility. Such data could be suitable for the routine analysis of the condition of local road networks. A particularly useful application could be the assessment of the condition of low-volume rural road networks in developing countries where the majority of rural roads are constructed from either gravel or earth and where smartphone ownership is surprisingly high (World Bank Group 2016).
It is recognized that the work presented herein is based on simulation, and therefore the work has a number of possible limitations. An objective of the research was to compare the simulated vertical accelerations measured by a smartphone of a vehicle travelling on actual road sections, with the roughness of the road sections determined from a profile of the road using a standard method (Sayers 1995). The former was simulated using the CarSim vehicle dynamics package. The road roughness values of the road sections were computed using a standard procedure developed by Sayers (1995) which is used by the road data collection industry to determine IRI from data collected by devices which measure the road profile. CarSim and the method provided by Sayers (1995) use models of vehicle motion to determine the vertical vehicle body acceleration and the IRI, respectively, from the summation of model vertical displacements. CarSim replicates the behavior of actual vehicles, whereas the quarter-car model used by Sayers (1995) is a simplified model of a standardized vehicle. Although CarSim has been shown by independent studies to closely replicate the performance of real vehicles, the components of real vehicles that might affect vehicle vertical body acceleration will vary because of age, use, and maintenance (i.e., suspension stiffness, damping, tire pressure, and mass). The combined effects of these parameters on the vertical vehicle accelerations for a single vehicle class were not considered in this work. Furthermore, the work presented assumed that the simulated vertical vehicle body accelerations were equivalent to the accelerations measured by a smartphone fixed inside a vehicle. However, the simulation did not replicate the effect of varying the mounting system nor the position of the smartphone in a vehicle. These collectively can affect the recorded vertical acceleration values by ±15% (Kropáč and Múčka 2005; Belzowski and Ekstrom 2015).
For a practical system, the smartphone should be adhered to the windshield with a suction cup and a bracket that attaches to the dashboard to ensure a rigid mounting, preventing the smartphone from moving freely.
The following conclusions can be drawn from the research:
1. Converting vehicle body acceleration to Grms greatly facilitates the prediction of road roughness;
2. Vehicle body acceleration, smartphone type, vehicle type, and speed were found to be the dominant factors that influence road roughness measurement;
3. A data collection system which utilizes smartphones suitably fixed inside a moving vehicle to assess road roughness can satisfy the frequency of data collection and accuracy requirements of a Class 3 device if vehicle speed is taken into account and a suitable data processing approach is adopted;
4. Without knowing vehicle type and speed, Grms data can be utilized to assess road roughness to a similar degree of accuracy as can be achieved by a visual inspection; and
5. The use of a bandpass filter to eliminate high frequencies should be considered to account for the effect of the shape of road profiles on the measurement of IRI using a smartphone, particularly for roads in good condition.
These factors notwithstanding, although the algorithms presented for the analysis of vertical vehicle body acceleration demonstrated promising results for the artificial data sets developed for the research, clearly they need further testing using data obtained in the field, calibration, and refinement to the conditions at hand for their practical application.