New York City Restaurant Inspections
A Data Science & Analytics Research Project
Conducted and Completed By: Andrew and Evan Pitigoi
– Introduction –
About Us –
Hello! Our names are Andrew and Evan Pitigoi, the twins and authors of this research paper. For starters, we are 11th grade students from Los Gatos, CA. We are both serial entrepreneurs who have a true passion for data-driven insights in the business world. On top of this, we are aspiring data scientists who love to analyze the world around us. Our data science background includes machine learning and AI work in Python, various forms of analysis conducted in Excel for our start-ups, and many hours of independent study. This project is a culmination of our programming and statistical knowledge thus far, as well as a way for us to learn even more about the analytical world. We are both very excited and honored to share our research and insights on the open data provided by New York and its restaurant inspection results.
The Project –
Our goal for this research project is to make sense of a complex dataset by formulating a problem statement that the data can answer. Specifically, we want to know whether we can make recommendations for hungry customers, set priorities for busy city inspectors, and issue warnings for New York restaurants. For example, a city inspector might ask, “What type of restaurant should we inspect first?” and “Where might those restaurants be?”, while a customer might ask, “What’s the best location to get clean food from?” We hope to find those answers through our analysis. We will conduct our research using both Python and Microsoft Excel, which will allow us to pull out specific insights through statistical analyses. We will now move on to explain the data source at hand and what we plan to do with it.
– The What and Why –
The Data Set –
The NYC Health Department, in collaboration with NYC Open Data, openly publishes its data on New York City Restaurant Inspection Results, on which we are conducting our analysis. This dataset contains every sustained or not yet adjudicated violation citation from inspections conducted up to three years prior to the most recent inspection of each restaurant in New York City. All of the restaurants included in the data set are currently operating or temporarily on hold (besides the small percentage that have been shut down permanently). The dataset also provides the type of cuisine the restaurant serves, the location of the restaurant, details of the inspection and grading, and other information about the restaurant itself.
The Grading –
The grading is simple: an A grade is equivalent to a score of 0–13, a B grade is a score of 14–27, and a C grade is a score of 28 and above. Points are added for each violation, so a lower score is better. Restaurants all receive an initial inspection at least once a year, and if an A isn’t given, they must be re-inspected no less than 7 days later. If the restaurant receives a B or C upon re-inspection, it is labeled “Grade Pending”. On top of this, if any critical violations are left unresolved, there is the possibility of closure after an additional inspection. If a restaurant is forcibly closed, the next inspection (for reopening) will occur approximately three months later. The reopening inspection only determines whether the restaurant can open to the public again; it is not graded. If the restaurant passes the reopening inspection, the previously posted grade card is taken down and it is given a “Grade Pending” card until a re-inspection determines the next grade.
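As a minimal sketch, the score bands above can be written as a small Python function. The name `grade_for_score` is our own illustration, not something taken from the city’s systems, and the sketch assumes that a score of exactly 27 still counts as a B, with C starting at 28.

```python
def grade_for_score(score: int) -> str:
    """Map an inspection score to a letter grade (lower scores are better).

    Assumed bands: A = 0-13, B = 14-27, C = 28 and above.
    """
    if score < 0:
        raise ValueError("inspection scores cannot be negative")
    if score <= 13:
        return "A"
    if score <= 27:
        return "B"
    return "C"


# Walk across the band boundaries to see where the grade changes.
for example_score in (0, 13, 14, 27, 28, 45):
    print(example_score, grade_for_score(example_score))
```

This only covers the letter grade itself; the “Grade Pending” status and closures depend on the inspection sequence and are not modeled here.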
Why This Data Set?
The data New York publicly releases is a great step toward increased transparency, but with close to half a million rows of data, the general public can’t interpret much of it. That is why we intend to draw helpful insights for all parties involved: the restaurants, the customers, and the inspectors. Now that you know why you are here, we will move on to give a more specific overview of the data at hand.
– Descriptive Analysis –
Before we begin to analyze the more complicated aspects of our data, we want to start with some descriptive statistics that cannot be found simply by looking at the raw data set.
The New York City Restaurant Inspection Results dataset contains 412,159 inspections. Each row in the data set is a single inspection and the columns include all of the pertinent information. The most relevant columns, the ones we will analyze, are Cuisine Type, Boro, Score, Grade, Violation Type, Critical Flag, and Action. The other columns are less relevant to our questions, since they describe details such as longitude and latitude. Put simply, there are 85 different cuisine types ranging from American to Creole, 5 boroughs (Manhattan, Queens, Bronx, Brooklyn, and Staten Island), and 106 different types of violations. On top of this, the only actions taken after inspections are citing the violations, closing the restaurant, or re-opening or re-closing the establishment. Each section below contains descriptive analysis for one important column within the data set.
On the next page a pie chart helps us visualize the inspection count for the top 10 most common cuisine types as a percent of the total inspections in the data set. For reference, there are 78,756 American-related inspections and 12,647 Bakery Product inspections. While some regions are certainly more influenced by specific cultures, each borough has the same top 10 cuisines shown below. More specifically, about 20% of the restaurants are American, no matter which borough one is eating in.
Percentage of Cuisine Type in New York ⬇
Below is a visual of the 5 boroughs in New York. While Manhattan is the smallest area, it houses the most restaurants. Here are the restaurants housed in each borough: Manhattan has 10,600 restaurants, The Bronx has 2,400 restaurants, Brooklyn has 6,600 restaurants, Queens has 6,000 restaurants, and Staten Island has 1,000. Overall, there are about 27,000 restaurants in New York City.
The 5 Boroughs ⬇
There are 106 different types of violations recorded. The histogram below shows the top 10 violations with their corresponding definitions. This allows us to see where restaurants should improve and where the city should offer more resources to mitigate violations.
Total Number of Top 10 Violations in New York ⬇
Below is the Distribution of Scores⬇
The mean score of all inspections is 20, or a B. As you can see, the distribution is not normal: it is skewed, with its peak, the mode in this case, at a score of 12. The distribution shows us that while most restaurants receive A’s, there are many non-A scores (anything above 13) that pull the average score up into the B range.
Pie Chart of Grade Appearance Percentage ⬇
The dataset consists mostly of A’s, which suggests that something is going right within the inspection system. Additionally, when looking at a box plot of scores and grades, it became apparent that many grades didn’t match their scores. We will go into more depth on this in the city recommendation section.
There are 215,040 critical flags and 189,690 non-critical flags. This means that 53% of flagged inspections result in critical flags, while 47% result in non-critical flags. There is also a direct relationship between a higher score and more critical flags.
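For readers who want to check the arithmetic, here is a short sketch that recomputes the two percentages from the flag counts quoted above. The function name `flag_shares` is only illustrative. Note that the two counts sum to 404,730, slightly fewer than the 412,159 rows in the data set, so the shares are taken over flagged rows only.

```python
def flag_shares(critical_count: int, non_critical_count: int) -> tuple[float, float]:
    """Return the critical and non-critical shares of all flagged rows."""
    total = critical_count + non_critical_count
    return critical_count / total, non_critical_count / total


# Counts quoted in the text above.
critical = 215_040
non_critical = 189_690

critical_share, non_critical_share = flag_shares(critical, non_critical)
print(f"{critical_share:.0%} critical, {non_critical_share:.0%} non-critical")
# prints: 53% critical, 47% non-critical
```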
Histogram of Actions Taken After Inspection ⬇
– Customer Recommendation –
In this section you will find information and recommendations, along with the supporting analysis, for anyone who plans to eat out in New York City.
Where to Eat
When analyzing the number of restaurants within the 5 boroughs of New York City, it becomes very apparent where it is easiest to find food. Manhattan has close to 40% of the restaurants inspected in the last year. Brooklyn follows with 25%, then Queens with slightly less. The Bronx has just under 10% of the city’s restaurants by this count, and Staten Island has less than half that. On top of this, Manhattan is the smallest borough by area, at just 22 square miles. This is why we recommend eating in Manhattan: restaurants are easy to find and the cuisine options are many. Additionally, be slightly more careful when picking a place to eat in Staten Island, as this borough receives the most critical flags upon inspection. On the opposite end, Brooklyn has the lowest percent of critical flags, so restaurants in that location might be a safer bet.
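The shares quoted above can be roughly reproduced from the rounded per-borough restaurant counts given in the descriptive analysis. The sketch below is only an illustration: the counts are the rounded figures from this paper, and `borough_shares` is a name we made up for the example.

```python
def borough_shares(counts: dict[str, int]) -> dict[str, float]:
    """Convert per-borough restaurant counts into shares of the city total."""
    total = sum(counts.values())
    return {borough: count / total for borough, count in counts.items()}


# Rounded restaurant counts quoted earlier in this paper.
restaurants_per_borough = {
    "Manhattan": 10_600,
    "Brooklyn": 6_600,
    "Queens": 6_000,
    "The Bronx": 2_400,
    "Staten Island": 1_000,
}

shares = borough_shares(restaurants_per_borough)
# Print the boroughs from largest to smallest share.
for borough, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{borough}: {share:.1%}")
```

Because the inputs are rounded to the nearest hundred, the printed shares are approximate.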
What to Eat
According to Forbes, New York is the number 1 location for food diversity in the world.
Pie Chart of The Most Prominent Food Types in New York ⬇
Outside of these popular options, we have some recommendations based on A grade appearance. 71% of Hotdog establishments receive A’s, 65% of Donut places receive A’s, and 59% of Hamburger restaurants receive A’s. These three cuisine types also have only a 1% chance of scoring C’s. On the other hand, we advise caution when picking a Moroccan, Chinese, Egyptian, or Cajun restaurant because these cuisine types receive the highest percentage of B’s and C’s.
When in New York, we recommend that you generally avoid fast food restaurants. For example, the chains with the most violations are Dunkin’ Donuts, followed closely by Subway and McDonald’s. These establishments are also the biggest offenders for rat and vermin violations. Even though we found that donut and hamburger places received the most A’s, these fast food chains are not good choices. That’s why we advise customers to pick a local or non-fast-food place when eating. A reliably clean establishment matters more than reliably fast service, so eat at a spot like Dough Donuts or Burger Joint over Dunkin’ and McDonald’s.
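A per-cuisine grade breakdown like the one above could be computed along the following lines. This is a hedged sketch, not our exact workflow: the function name `grade_shares_by_cuisine` is hypothetical, the input is assumed to be simple (cuisine, grade) pairs, the rows shown are toy values rather than real results, and only rows graded A, B, or C are counted.

```python
from collections import Counter, defaultdict

LETTER_GRADES = ("A", "B", "C")


def grade_shares_by_cuisine(rows):
    """Return {cuisine: {grade: share}} over rows graded A, B, or C."""
    counts = defaultdict(Counter)
    for cuisine, grade in rows:
        if grade in LETTER_GRADES:  # skip pending or missing grades
            counts[cuisine][grade] += 1
    shares = {}
    for cuisine, grade_counts in counts.items():
        total = sum(grade_counts.values())
        shares[cuisine] = {g: grade_counts[g] / total for g in LETTER_GRADES}
    return shares


# Toy rows for illustration only; these are NOT figures from the dataset.
toy_rows = [
    ("Donuts", "A"), ("Donuts", "A"), ("Donuts", "B"),
    ("Cajun", "A"), ("Cajun", "B"), ("Cajun", "B"), ("Cajun", "C"),
    ("Cajun", None),  # ungraded row, ignored by the function
]

toy_shares = grade_shares_by_cuisine(toy_rows)
for cuisine, by_grade in toy_shares.items():
    print(cuisine, {g: f"{s:.0%}" for g, s in by_grade.items()})
```

Whether ungraded rows belong in the denominator is a design choice; leaving them out, as here, gives shares that sum to 100% across A, B, and C.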
– City Recommendation –
In this section you will find recommendations for both the city inspectors and the people who manage the data sets in New York City. These recommendations will be supported with some of our analysis, as well as visuals alongside it.
Data Set Fixes
The inspection data is currently collected by hand. While this may seem to be the most effective way of recording data, it leaves room for many errors within the data set. The most apparent errors are in the grading and scoring columns. For example, the grade “G” occurred a few times due to a typo, and in many cases the score didn’t line up with the grade.
Below is a box-plot of the Grades and Their Scores⬇
There are many outliers and overlaps between the grades, which shouldn’t happen if each grade corresponds to a fixed score range. We recommend that the city automate the grade output for a given score. On top of this, it might be a good idea to have the inspector only check boxes for violations, so that the system can output an exact score and grade with no room for accidental or purposeful error. Also, every inspection should be required to have each column filled out. In many cases, an inspection was missing at least one data point.
Most of our initial analysis aimed to find out how we can help New York City inspectors prioritize their future inspections. The city states that it tries to inspect each restaurant at least once a year, but some restaurants are receiving special treatment by getting inspected every 3–5 months. To start, 40% of restaurants receive a re-inspection after failing their first inspection. We feel that the inspectors should prioritize restaurants that score worse and receive more violations, so that those restaurants can get up to an A grade faster. Ultimately, prioritizing the historically worse restaurants will make New York a cleaner eating environment. We first looked at the boroughs to see which location was scoring worse. We found that the boroughs score about the same.
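To illustrate the kind of automated check we have in mind, here is a minimal sketch that flags rows whose recorded grade disagrees with the score. Everything in it is an assumption for illustration: the field names `score` and `grade`, the function names, and the toy rows are ours, and the bands assume A = 0–13, B = 14–27, and C = 28 and above.

```python
VALID_GRADES = {"A", "B", "C"}


def expected_grade(score: int) -> str:
    """Grade implied by a score under the assumed bands."""
    if score <= 13:
        return "A"
    if score <= 27:
        return "B"
    return "C"


def find_grade_problems(rows):
    """Return (row, reason) pairs for rows that look inconsistent."""
    problems = []
    for row in rows:
        score, grade = row.get("score"), row.get("grade")
        if score is None or grade is None:
            problems.append((row, "missing score or grade"))
        elif grade not in VALID_GRADES:
            problems.append((row, f"unexpected grade {grade!r}"))
        elif grade != expected_grade(score):
            problems.append((row, f"score {score} implies {expected_grade(score)}"))
    return problems


# Toy rows for illustration only; these are NOT taken from the dataset.
toy_rows = [
    {"score": 10, "grade": "A"},   # consistent
    {"score": 30, "grade": "A"},   # mismatch
    {"score": 12, "grade": "G"},   # typo-style grade
    {"score": None, "grade": "B"}, # missing data point
]

for row, reason in find_grade_problems(toy_rows):
    print(row, "->", reason)
```

A real check would also need to allow for whatever pending or not-yet-graded codes the dataset uses, rather than treating every non-letter value as an error.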
Percentage of Grades per Borough⬇
Each borough has almost the exact same grade distribution across the board, as well as a matching ratio of critical to non-critical flags. This means the city is doing a good job keeping inspections fair across all boroughs. It also means that restaurants, on average, don’t hold themselves to worse standards because of the borough they are in. We will acknowledge that The Bronx has 3% fewer A’s than the other boroughs, but this isn’t a dramatic enough difference to connect lower quality food to this region. On a similar note, Staten Island gets the fewest C’s, but it has fewer inspections and the difference is not dramatic.
Box-plot of Cuisine Types & Their Scores⬇
Each cuisine type is represented by a box plot above. While the chart is very busy, we boxed in the cuisine types that are most important to notice and highlighted the average score for each. We highlighted these cuisine types because their interquartile range is wider and their mean is higher. These areas are where we feel inspectors should focus more of their efforts. More specifically, we feel that inspecting African, Bangladeshi, and Chinese restaurants (all with an average score of 24), as well as Cajun and Filipino restaurants (both with an average score of 23), more frequently could help lower the average score and shrink the range in scoring. Overall, more frequent inspections can help lower the scores of restaurants sooner and allow those same restaurants to improve for public safety.
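The two quantities we read off the box plot, the mean score and the interquartile range per cuisine, can be sketched in Python as follows. This is an illustrative outline rather than our exact code: `score_summary_by_cuisine` is a hypothetical name, the input is assumed to be (cuisine, score) pairs, and the rows are toy values with placeholder cuisine names.

```python
from collections import defaultdict
from statistics import mean, quantiles


def score_summary_by_cuisine(rows):
    """Return (cuisine, mean score, IQR) tuples, worst mean score first."""
    scores = defaultdict(list)
    for cuisine, score in rows:
        scores[cuisine].append(score)
    summary = []
    for cuisine, values in scores.items():
        if len(values) < 2:
            continue  # quartiles need at least two scores
        q1, _median, q3 = quantiles(values, n=4)
        summary.append((cuisine, mean(values), q3 - q1))
    # Higher scores are worse, so sort descending to put priorities first.
    summary.sort(key=lambda item: item[1], reverse=True)
    return summary


# Toy rows for illustration only; these are NOT figures from the dataset.
toy_rows = [
    ("Cuisine X", 10), ("Cuisine X", 20), ("Cuisine X", 30), ("Cuisine X", 40),
    ("Cuisine Y", 8), ("Cuisine Y", 10), ("Cuisine Y", 12),
    ("Cuisine Z", 50),  # single inspection, skipped
]

toy_summary = score_summary_by_cuisine(toy_rows)
for cuisine, avg, iqr in toy_summary:
    print(f"{cuisine}: mean {avg:.1f}, IQR {iqr:.1f}")
```

Cuisines near the top of this list, with both a high mean and a wide interquartile range, would be the candidates for more frequent inspection.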
– Restaurant Owner Recommendation –
By analyzing the data we were also able to find some recommendations for restaurants.
What to Watch out For
When looking at critical violations (ones that will fail a restaurant’s inspection), we wanted to help restaurants see which violations to avoid. We would also like to warn restaurants of the top non-critical violations.
Here are the top violations per flag type⬇
We recommend that restaurants make sure their temperature control, storage, and washing are all up to code before an inspector walks in. We would also like to warn restaurants about being inspected in the second half of the year. When comparing the number of violations involving rodents, flies, roaches, and vermin in each half of the year, we found that the second half of the year, which includes the hotter months, has a much greater percentage of these violations. That is why we urge restaurants to bring out the fly swatters and set up traps before a summer inspection.
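The half-year comparison described above could be sketched like this. It is a rough illustration under stated assumptions: the function name `pest_share_by_half`, the keyword list, and the toy rows are all ours, and matching keywords in the violation description is a crude stand-in for however pest violations are actually coded in the data.

```python
from datetime import date

# Assumed keywords; a real analysis would use the dataset's violation codes.
PEST_KEYWORDS = ("mice", "rats", "roaches", "flies")


def pest_share_by_half(rows):
    """Return {1: share, 2: share} of pest-related violations per half-year."""
    totals = {1: 0, 2: 0}
    pests = {1: 0, 2: 0}
    for inspection_date, description in rows:
        half = 1 if inspection_date.month <= 6 else 2
        totals[half] += 1
        text = description.lower()
        if any(keyword in text for keyword in PEST_KEYWORDS):
            pests[half] += 1
    # Avoid dividing by zero when a half-year has no rows at all.
    return {h: (pests[h] / totals[h] if totals[h] else 0.0) for h in (1, 2)}


# Toy rows for illustration only; these are NOT taken from the dataset.
toy_rows = [
    (date(2019, 2, 11), "Food not held at proper temperature"),
    (date(2019, 3, 5), "Evidence of mice in facility"),
    (date(2019, 4, 20), "Non-food contact surface improperly constructed"),
    (date(2019, 5, 2), "Plumbing not properly installed"),
    (date(2019, 8, 14), "Live roaches present"),
    (date(2019, 9, 30), "Filth flies present in food area"),
    (date(2019, 11, 3), "Food not protected from contamination"),
]

toy_result = pest_share_by_half(toy_rows)
print(f"Jan-Jun: {toy_result[1]:.0%}, Jul-Dec: {toy_result[2]:.0%}")
```

Comparing shares rather than raw counts keeps the two halves comparable even if more inspections happen in one half of the year than the other.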
– Conclusion –
The New York City Restaurant Inspection Results data set might seem obscure at first, but through analysis we were able to make sense of it. We found some pretty useful insights for you (the reader), as well as some great takeaways for us as data analysts.
We hope you enjoyed reading our paper, and we also hope you were able to learn something new from our insights. Here is a quick recap of some of our findings:
- There are 27,000 restaurants in NYC, and 10,600 of them are in Manhattan, which makes it an easy place to find food.
- Across all 5 boroughs the grading is fair with equal ratios between A’s, B’s & C’s; meaning you shouldn’t avoid any borough when looking for clean/safe food.
- NYC is the most diverse place for food in the world with about 85 different cuisines, but the most common cuisines are American, Chinese, Coffee, Pizza, and Mexican.
- Close to 80% of restaurants are graded an A, so most of the time the food is safe in NY.
- The biggest violation offenders are surface area issues, rodents, and vermin.
- The average score in the data set is around 19, or a B, while the mode of the data set is 12, or an A. The poor scores pulling the average into the B range are mostly from initial inspections, but after re-inspection most restaurants are able to receive an A.
- Hamburger, Hotdog, and Donut restaurants receive some of the highest shares of A’s, so these are safe bets and also popular food choices.
- On the other hand, be more careful when eating Moroccan, Chinese, Egyptian, or Cajun because these cuisine types receive the most B’s and C’s.
- In general, avoid fast food establishments like Dunkin’ Donuts, Subway, and McDonald’s because these restaurants not only receive the most violations, but are also the biggest offenders for vermin, rat, and bug related violations. Just pick a local spot instead.
- While the city is doing a good job keeping inspections fair by borough, it needs to focus more re-inspection efforts on African, Bangladeshi, Chinese, Cajun, and Filipino restaurants. These restaurants have some of the highest average scores, and re-inspecting sooner or more frequently could help these establishments resolve their violations faster.
- The data set should also be automated to output a specific score and grade per selected violation to ensure more accurate data.
- If you own a restaurant, make sure to have appropriate temperature, surfaces, and storage before inspections. Also be aware that inspections in the second half of the year turn up more vermin, rodent, and bug violations.
This experience was very helpful for us. It not only allowed us to use the analytical skills we have developed thus far, but also pushed us to learn more in order to achieve our goals. This is only the beginning of our journey to becoming data scientists, and we can’t wait to do more projects like this in the future. We also want to thank our mentor, Khatereh Khodavirdi, for helping us along the journey. She has been amazing to work with, and we couldn’t have done this project without her. Thank you for your time, and we hope you enjoyed reading our paper.
Thank You For Reading
Evan Pitigoi — firstname.lastname@example.org
Andrew Pitigoi — email@example.com