Enhancing Analytics: Move from Excel to Python
Good afternoon from London :wave::wave:
I am a Data Scientist with two years of professional experience in machine learning, data analysis, and visualization, and a wide range of technical skills in programming, data science, cloud computing, and project management.
Keywords: Visual Analytics, Statistical Data Analytics, Data Management
Technical Skills:
- Programming: Python, R, SQL, MATLAB
- Data Science: Pandas, NumPy, NLTK, Scikit-learn, Matplotlib, Seaborn, Plotly, Tableau, QGIS, Google Analytics
- Cloud Computing: Databricks, GCP, Azure + Data Lake
- Others: Git, Trello, LaTeX, MS Office tools, Jira
- Interests and Learning: Software development (Agile), Open-source contribution
Professional demeanor: The word cloud of positive feedback I’ve received from colleagues
After completing my postgraduate studies in 2021, I pursued further academic achievements and early career experiences, such as writing a paper, assisting with a textbook publication, and serving as a lead instructor teaching programming. Since transitioning into public healthcare, I have been involved in several projects spanning various NHS organizations and Trust sites. That earned me positive evaluations in my latest appraisal.
The table below provides a convenient comparison, illustrating how my master’s assessments and work projects distribute emphasis among different skills and tasks: data processing, programming, review, and report/visualization.
Assessment / project | Data processing | Programming | Review | Report/Viz |
---|---|---|---|---|
MSc Data Science | - | 70% | 20% | 10% |
Project A | 50% | 30% | 10% | 10% |
Project B | 95% | 5% | - | - |
Project C | 5% | 65% | 30% | - |
Average(ABC) | 50% | ~30% | <15% | <5% |
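
As a quick check, the Average(ABC) row can be reproduced from the three project rows. This is a minimal sketch that assumes "-" in the table means 0%; the dictionary layout and the `column_averages` helper are illustrative names, not part of the original analysis.

```python
# Percentages copied from the table above.
# Assumption: a "-" cell is treated as 0%.
projects = {
    "Project A": {"Data processing": 50, "Programming": 30, "Review": 10, "Report/Viz": 10},
    "Project B": {"Data processing": 95, "Programming": 5, "Review": 0, "Report/Viz": 0},
    "Project C": {"Data processing": 5, "Programming": 65, "Review": 30, "Report/Viz": 0},
}


def column_averages(rows):
    """Return the mean share per skill across all rows."""
    skills = next(iter(rows.values())).keys()
    return {
        skill: sum(row[skill] for row in rows.values()) / len(rows)
        for skill in skills
    }


averages = column_averages(projects)
for skill, value in averages.items():
    print(f"{skill}: {value:.1f}%")
```

Under that assumption the unrounded means sit close to the rounded figures shown in the table (50%, about 30%, under 15%, under 5%).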
Whilst programming can include data processing, I primarily used robust datasets during my academic year. In the other cases, I found that loading semi-structured data such as JSON (obtained from APIs), or unstructured data like images or WhatsApp text, required stronger programming foundations for the data processing. That’s why I chose to assign the full credit to programming alone for the master's programme.
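
To illustrate why semi-structured data calls for more programming than a tidy spreadsheet does, here is a minimal sketch that flattens nested JSON into rows and columns using only the standard library. The payload, field names, and the `flatten` helper are hypothetical and invented for illustration; they do not come from any real API used in my projects.

```python
import json


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the key prefix along.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hypothetical API response: nested objects and a missing field.
payload = """
[
  {"id": 1, "site": {"name": "Site X", "region": "London"}, "beds": 120},
  {"id": 2, "site": {"name": "Site Y"}, "beds": null}
]
"""

records = [flatten(item) for item in json.loads(payload)]

# Build a rectangular table: the union of all keys becomes the column list,
# and fields missing from a record are filled with None.
columns = sorted({key for row in records for key in row})
table = [[row.get(col) for col in columns] for row in records]

print(columns)
for row in table:
    print(row)
```

In practice, pandas offers `pandas.json_normalize` for the same kind of flattening; the hand-written version just makes the steps visible.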
At work, I sometimes made a strategic decision not to handle the review or reporting part of a specific project, as shown in the table. This choice was agreed within the team. When other priorities required my immediate attention, I was able to pass all my model ideas and findings (with my thanks) to the other contributors working on the same project. That is certainly different from how I used to carry sole responsibility for my university coursework; the team needs me for teamwork!
These differences highlight how the technical skills an individual needs to meet their task commitments vary with the level of collaboration.
This applies to me all the more when working in wider group project settings, where multiple data collection points and interactions with diverse feedback loops are more frequent. For example, Project A aims to produce a distributional guide of human resources (medical professionals) for NHS England regions based on multiple factors such as demography, morbidity, deprivation, and demand projections. The following datasets are the required reference sources for Project A:
Each dataset demands delicate work to map the national data to the NHS’s 42 Integrated Care Boards (ICBs) or their sub-locations. Additionally, certain types of information, such as the counts of in-patient and out-patient episodes, limit the lowest level at which data can be collected. Given these complexities, it becomes evident why a considerable amount of data processing is necessary.
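
The regional mapping step can be sketched roughly as follows: look up each sub-location's ICB, sum the values per ICB, and keep track of anything that fails to map. All codes, figures, and the `aggregate_to_icb` helper below are hypothetical placeholders, not real NHS identifiers or Project A data.

```python
from collections import defaultdict

# Hypothetical lookup from sub-location code to ICB code (not real NHS codes).
sub_location_to_icb = {
    "SL001": "ICB_A",
    "SL002": "ICB_A",
    "SL003": "ICB_B",
}

# Hypothetical national extract: one row per sub-location.
national_rows = [
    {"sub_location": "SL001", "episodes": 1200},
    {"sub_location": "SL002", "episodes": 800},
    {"sub_location": "SL003", "episodes": 500},
    {"sub_location": "SL999", "episodes": 40},  # no mapping available
]


def aggregate_to_icb(rows, lookup):
    """Sum episode counts per ICB and collect sub-locations with no mapping."""
    totals = defaultdict(int)
    unmapped = []
    for row in rows:
        icb = lookup.get(row["sub_location"])
        if icb is None:
            # Keep unmapped codes visible instead of silently dropping them.
            unmapped.append(row["sub_location"])
            continue
        totals[icb] += row["episodes"]
    return dict(totals), unmapped


icb_totals, unmapped = aggregate_to_icb(national_rows, sub_location_to_icb)
print(icb_totals)
print("Unmapped sub-locations:", unmapped)
```

Returning the unmapped codes alongside the totals is a deliberate choice in this sketch: in a spreadsheet a failed lookup is easy to miss, whereas here it is reported explicitly for review.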
… to be continued on YouTube