Chapter 7 Conclusion
7.1 Main Takeaway
Based on the exploratory data analysis and visualizations in the previous sections, we conclude that more car crashes occurred between 8 and 9 o’clock in the morning and between 4 and 6 o’clock in the afternoon, the rush hours, than at other times of day. By month, more vehicle accidents were observed in October, November, December and January, the late fall and winter months, than in other months. These results remind us to be more careful when driving during these periods.
Looking more closely at the weather, lighting and road surface conditions of each crash, we observed that on unlighted dark roads, muddy and snow- or ice-covered surfaces become more dangerous than at other times, while on lighted dark roads, wet and flooded surfaces become more dangerous. Drivers should take extra precautions when they encounter these combinations of lighting and road surface conditions. Nevertheless, it is important to note that most car crashes occurred in daylight and on dry road surfaces, conditions one would normally consider “good” rather than bad. Drivers should therefore not let their guard down under ideal conditions, which, as the data remind us, account for the largest number of crashes.
New York State, with 62 counties and a population of roughly 20 million, is densely populated, and New York City, home to about 8.5 million of those residents, is the most populous and most densely populated major city in the US. The data reflect this: counties in the New York City area, including Queens, Nassau, Suffolk and Kings, had more car crashes than other counties in NY. These four counties also had the highest number of fatal accidents. People who drive in the New York City area, and in these four densely populated counties in particular, should therefore be especially careful.
Around 70% of the vehicle accidents in the data are collisions with other motor vehicles, followed by collisions with animals and with other fixed objects. Drivers should therefore pay attention not only to the physical environment but also to the other motor vehicles around them. Collisions with other vehicles also tend to affect more people. Among crashes where a traffic control device was present, most collisions with pedestrians, bicyclists and other motor vehicles occurred at traffic signals. In contrast, most collisions with animals and with fixed objects, including trees, guide rails and earth embankments or rock cuts, occurred in No Passing Zones. In general, drivers should pay much closer attention to traffic control devices when they are present, because they give useful instructions that help us avoid accidents and stay safe. As the data show, violating those instructions can lead to property damage, injuries and even life-threatening accidents.
All in all, people should be careful when driving during periods of heavy traffic, stay aware of the overall traffic environment, including weather, lighting, road surface conditions and other motor vehicles, and pay attention to traffic control devices. As we mentioned at the beginning, none of us can completely protect ourselves against the odds of being in an auto accident, but taking these lessons from the data can help us all stay safer on the roads.
7.2 Limitations
Note that the dataset we used for this project contains data from both law enforcement reports and motorist crash reports filed with the DMV. The DMV is solely the collector of the data and has no influence on the accuracy of what is reported to it. Since part of the data comes from self-reported files, some self-report bias may be present.
In addition, we used only the crash case data. The DMV provides three more datasets covering car crashes in NY State over the same time period: individual data, vehicle data and violation data. However, the four datasets have different numbers of observations, so we could not link them case by case. Given these consistency issues, and because the case dataset contains the largest number of observations and provides most of the information we need, we did not combine the other three datasets with it. We believe that analyzing those three datasets could reveal more information relevant to motor vehicle accidents.
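As a rough illustration of the linkage problem, the sketch below shows one way to check how many cases in two tables can be matched before attempting a merge. It is not the procedure used in this project: the key name case_id, the function linkage_report and the toy rows are all hypothetical, and we assume only that each table carries some shared case identifier.

```python
from collections import Counter


def linkage_report(crash_rows, other_rows, key="case_id"):
    """Summarize how well two tables link on a shared key.

    `key` is a hypothetical column name; substitute whatever
    identifier the real tables actually share.
    """
    # Count how many rows each table holds per case identifier.
    crash_keys = Counter(row[key] for row in crash_rows)
    other_keys = Counter(row[key] for row in other_rows)
    shared = crash_keys.keys() & other_keys.keys()
    return {
        "crash_cases": len(crash_keys),
        "other_cases": len(other_keys),
        "linked_cases": len(shared),
        "crash_only": len(crash_keys.keys() - other_keys.keys()),
        "other_only": len(other_keys.keys() - crash_keys.keys()),
        # Average number of rows in the other table per linked case.
        "rows_per_linked_case": (
            sum(other_keys[k] for k in shared) / len(shared) if shared else 0.0
        ),
    }


# Toy data only, not taken from the DMV datasets.
crashes = [{"case_id": 1}, {"case_id": 2}, {"case_id": 3}]
vehicles = [
    {"case_id": 1, "vehicle": "A"},
    {"case_id": 1, "vehicle": "B"},
    {"case_id": 4, "vehicle": "C"},
]
report = linkage_report(crashes, vehicles)
print(report)
```

A large crash_only or other_only count would indicate that the tables cannot be combined case by case without losing observations.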
7.3 Future Work
As mentioned in the limitations above, the other three datasets provided by the DMV are still valuable. Future work for this project would include analyzing each of those three datasets individually and then combining and generalizing the results. Such work would give a more comprehensive view of motor vehicle accidents.