And how it relates to Data Science
Eleven years ago this month, I founded my first company. My youngest child was 10 years old and in the fifth grade, and my two other children were in the seventh and ninth grades. There was a bit of breathing room in my schedule and more time in the day for a job. Since I still wanted freedom and flexibility, I started a “handywoman” business instead of taking an office job.
Fast forward to 2020 and I’m starting on my next adventure in life: searching for work as a data analyst/scientist during a pandemic. The two jobs look dissimilar from the outside, but I find they have a lot in common. Here’s my top-five list of “words of wisdom” that apply to both data science and home improvement:
Don’t tile yourself into a corner
Tiling requires planning. You don’t want to finish a tile job, stand back to view the result, and realize you should have started tiling the backsplash on the left side of the sink instead of the right. It’s wise to plan methodically: where to start, how the job will progress row by row and column by column, what cuts will be necessary, and where the last tile will be set so that it looks attractive.
Analyzing data requires planning, too. Again, it’s wise to map a course carefully and diligently. Make time to ask: what is my goal with this data? What am I trying to achieve? And how will I get there? Time and energy are wasted if one doesn’t think about the end product.
Cleaning (ugh!) goes a long way
A mainstay of my business was interior painting: walls, ceilings, and trim. Although it is very tempting to start rolling new paint on walls (the thrill of a new, fresh color!), there’s a great deal of preparation and cleaning to do first. Before painting, surfaces should be lightly sanded to remove dirt and smooth the surface, lightly washed to remove sanding dust and residual dirt, and allowed to dry thoroughly. Large, unsightly holes should be filled and primed, and edges should be taped to create clean paint lines (if your hands aren’t steady enough to do without tape). Correct preparation takes time and patience, and it’s not always fun, since the results aren’t as instantly gratifying or as dramatic as a new color. But thorough cleaning is vital to creating a beautiful finished product.
Similarly, in data science, it can be tempting to jump into a data set and start plotting, graphing, visualizing, or modeling to see what’s going on. However, if there are NaNs, missing values, mistakes in the data, or improper data types, you’re not going to get too far. Time and patience are needed to thoroughly clean and prepare data so you can create a beautiful finished product.
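To make the cleaning step concrete, here is a minimal sketch in plain Python. The records, column names, and the list of “missing” markers are all made up for illustration; a real data set will have its own quirks, and many people would reach for a library such as pandas instead.

```python
import math

# Hypothetical raw records, as they might arrive from a CSV export:
# every value is a string, and some are blank or placeholder text.
raw_rows = [
    {"room": "kitchen", "sq_ft": "120", "paint_cost": "85.50"},
    {"room": "bath", "sq_ft": "", "paint_cost": "40"},
    {"room": "living", "sq_ft": "300", "paint_cost": "n/a"},
    {"room": "hall", "sq_ft": "45", "paint_cost": "nan"},
]

# Assumed set of strings that should count as "missing".
MISSING_MARKERS = {"", "n/a", "na", "nan", "null", "none"}


def to_float(value):
    """Return a float, or None when the value is missing or unparseable."""
    text = str(value).strip().lower()
    if text in MISSING_MARKERS:
        return None
    try:
        number = float(text)
    except ValueError:
        return None
    return None if math.isnan(number) else number


def clean(rows, numeric_columns):
    """Copy each row, converting the numeric columns to float or None."""
    cleaned = []
    for row in rows:
        new_row = dict(row)
        for col in numeric_columns:
            new_row[col] = to_float(row.get(col))
        cleaned.append(new_row)
    return cleaned


def missing_counts(rows, columns):
    """Count how many rows have None in each of the given columns."""
    return {col: sum(1 for r in rows if r[col] is None) for col in columns}


numeric = ["sq_ft", "paint_cost"]
cleaned_rows = clean(raw_rows, numeric)
missing = missing_counts(cleaned_rows, numeric)

# Keep only rows where every numeric column has a usable value.
complete_rows = [r for r in cleaned_rows if all(r[c] is not None for c in numeric)]

print(missing)
print([r["room"] for r in complete_rows])
```

Counting the gaps before dropping or filling anything is the data equivalent of inspecting the wall before opening the paint can: it shows how much prep work there is before any plotting or modeling begins.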
Do your research on what is behind the wall
For one client, I installed wall trim molding in their living room; it was something I had done several times before and felt confident tackling. I put up some low-adhesive tape to show the homeowner how it would look, measured the amount of trim needed, purchased, primed, and painted it, and then installed it. Voila!
In this instance, though, I didn’t do a good job of researching the pitfalls by thinking about what was behind the wall. If I had thought thoroughly about that particular wall and project, I would have investigated what was above the living room on the second story (the en suite bathroom) and looked in the basement to see the water pipes running up through the living room wall. The client was very understanding when I punctured their water pipe while nailing in the molding. (The damaged pipe was repaired by a licensed plumber.)
The moral of the story, as it applies to data science, is that every project is different, even if two projects seem the same on the surface. It’s worthwhile to take time to really look at the data and determine how it differs from other data sets you’ve worked with. Are different tools and libraries needed? Is there research that would deepen your understanding of the project?
There’s more than one way to remove a screw
Once, I was preparing to install a new closet system, and the only thing standing in my way was a 1” by 1” by ½” piece of wood attached to the floor by one rusty screw. The first attempt to remove the screw, by hand with a screwdriver, was unsuccessful. Attempt number two was to use an adjustable wrench. That didn’t work either because the rust had locked the screw in place. The third attempt was to use a cordless power drill, which stripped the screw (a special skill of mine: overdoing it to the point of stripping a screw). The last attempt was to swap the screwdriver bit on the drill for a drill bit and just demolish the screw. That did the trick: the little block was removed and the new closet system installed.
In data science, there’s usually more than one way to do something, too. The approach that finally works may not be efficient, elegant, or precise, but it gets the job done.
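As a small illustration, here are three ways to compute the same average in Python. The numbers are invented for the example, and none of the three is “the” right answer; they just trade off verbosity and convenience.

```python
import statistics

# Hypothetical hours spent on four jobs.
job_hours = [2.5, 4.0, 3.5, 6.0]

# Way 1: the screwdriver. An explicit loop, slow to write but easy to follow.
total = 0.0
for hours in job_hours:
    total += hours
mean_loop = total / len(job_hours)

# Way 2: the wrench. Built-in functions do the adding and counting.
mean_builtin = sum(job_hours) / len(job_hours)

# Way 3: the power drill. The standard library has a ready-made tool.
mean_stats = statistics.mean(job_hours)

print(mean_loop, mean_builtin, mean_stats)
```

All three land on the same number for this list. Which one to use depends on the job at hand, much like choosing between the screwdriver and the drill.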
Take a break to get back on track
On one particularly bad day, small glass tiles were popping off the wall while I was grouting, and I was losing my mind with panic, so I decided to take a break. Usually, I would take a breather: sit down and have a drink of water. On this bad day, I decided a dose of fresh air would work wonders. Getting outside calmed me down and cleared my mind, which helped me find a solution to the popping tiles.
Sometimes it’s possible to power through bad situations and think of solutions on the fly. It’s that “tough it out” mentality. However, there’s nothing weak about walking away for a break to regroup and ferret out a solution. In the case of the tiles popping off the wall, the break allowed me to come up with a unique solution and finish the backsplash. With data analysis, when the mind is tired, unfocused, and making careless errors, a break away from the data and the computer can work wonders.