Principal Data Scientist
About us…
We’re a global intelligent automation consultancy in a period of hypergrowth. Founded in 2020 in the UK, TQA has raised $20m in Series A funding and expanded to global offices in the US, Argentina, Romania and the Philippines. Our team of 140 extremely talented people across the globe loves to solve problems together. We passionately believe that automation has the power to expedite business transformation, enhance enterprise value and deliver a positive impact on people’s lives.
We’re looking for a highly skilled Principal Data Scientist to play a pivotal role in maintaining, upgrading and scaling our AI-powered systems, integrating these models into business workflows to deliver transformative, real-world applications, and helping our clients thrive. If you’re passionate about Machine Learning and thrive in a fast-paced environment, we want to hear from you.
This role will involve a diverse range of responsibilities that may evolve over time, requiring flexibility and adaptability as the business and its projects grow and change, offering the opportunity to work across multiple areas of data science, cloud infrastructure, and machine learning.
About you…
6+ years of Data Science consultancy experience (PhD + 3 years, Master's + 4 years, or Grad + 6 years; ideally the degree would be in Computer Science, Data Science or another STEM field):
- 3+ years client-facing experience
- At least two projects delivered into production
- Strong experience conducting Exploratory Data Analyses for customers across multiple industries and on different types of data: structured and unstructured, image, natural language, etc.
- Experience with supervised, semi-supervised and unsupervised ML
- Able to flex to project and client requirements, which may necessitate pragmatic compromise in order to deliver an outcome
- Knowledge and use of publicly available delivery accelerators, such as code repositories that can be used as a base, pretrained models etc.
- Able to use Generative AI models and workloads to accelerate delivery
- Experience with go-to-market offering creation and development, the presales and sales cycles and how to support them technically, and ongoing support of delivered projects
3+ years of experience in team leadership and management:
- Head of/Lead-level positions
- Must have had at least 2 direct reports (i.e. as their Line Manager) who were Data Scientists
- Involved in growing teams - experience in hiring, firing, growth areas etc
- Used to collaborating across geographies and timezones, adopting a ‘federated’-type approach - self-contained, but operating within and accountable for delivering against the overall global strategic priorities
- Comfortable with flexing work hours to fit client requirements, such as UK timezones
The ideal candidate will have…
Experience with carrying out Data Science and ML workloads on AWS. These would include:
- Amazon SageMaker (including training jobs, hyperparameter tuning jobs, inference endpoint deployment)
- ‘Core stack’ including Lambdas, S3, ECS etc.
- Data transformation technologies, including Glue/Athena (but not expecting full expertise)
- Strong knowledge of SageMaker Augmented AI and Ground Truth, leveraging their outputs to improve ML models
- Serverless ML design patterns for AWS, including using Lambdas and Step Functions to orchestrate SageMaker
Python open-source Data Science stack, including:
- Plotting: matplotlib, seaborn, plotly etc.
- Data manipulation: pandas, numpy, scipy etc.
- ML/Deep Learning: scikit-learn, pytorch etc.
- Generative AI: OpenAI API usage etc.
Communication skills:
- Able to present technical information to non-technical audiences
- Able to demonstrate the art of the possible to SMEs who don’t know what is possible with their data, and to identify data (public or otherwise) that the client might not know about or think to use
- Able to work with experts in domains outside your own skillset, which may include Solution Architects, Data Engineers, MLOps Engineers etc.
- Able to creatively solve problems, both human and machine, to deliver the greatest change and strongest outcome given the true landscape for a project or series of projects
- Strong English language skills
Experience with taking on, maintaining, supporting and upgrading production codebases:
- Experience responding to tickets logged against business-critical systems
- Able to triage and attempt to solve issues, such as data quality issues affecting the output of a production ML workload
- Able to communicate and collaborate with experts in other areas, such as Data Engineering, Web Development and MLOps, to solve issues that might span multiple areas
Experience with Databricks to deliver Data Science outputs, ML models and production-grade solutions
- Experience with multiple hyperscalers, especially Azure, and ideally Azure OpenAI
- Strong experience with production, such as understanding how to implement ML into end-user-facing workloads and embedding them in business processes effectively
- PoC to production experience with Generative AI, such as RAG deployment or GPT-X API integration into a user-oriented workflow
- Data Warehouse/Data Engineering technology exposure or experience, such as configuring Databricks from the outset for a new environment
Sounds good? Here’s an outline of the responsibilities.
Team cultivation (5-10%):
- Oversee local team development (1:1s etc.) and hold country-wide DS Huddles to share knowledge and discuss the cutting edge
- Create and refine go-to-market materials and offerings, such as how we will best leverage the Databricks best practices for our clients and how this dovetails into our broader Data offerings (working with the Data Science Director and other Data principals)
Managed service response (5-10%):
- We have signed a Managed Services agreement with a major UK-based company. They have a break-fix arrangement with us; however, many of the issues they report and identify actually stem from underlying data quality issues.
- We want to identify these proactively, but also respond to the client with triaged and catalogued issues
- You would be the main cover for the initial investigation that determines whether or not an issue is data quality-related, such as a malformed field in a form being parsed as a string instead of an integer. This kind of issue is simple but difficult to uncover without going through the codebase.
Research and development (10-20%):
- You would be working closely with our local and international teams to create accelerators and other IP that will enhance our offerings to clients
- This involves staying at the state of the art, such as reading and presenting the latest papers in journal clubs, as well as sharing more publicly through blogs and meetups
- TQA understands the need for this to be part of your role officially, and as such it is
Delivery (60-80%):
- For some current and new customers, you would run our AI Advisory and Data Science as a Service delivery models as the key expert DS. Initially, you will mostly operate alone, but in time you will oversee teams of more junior colleagues who will support you across multiple projects.
Our offerings include:
- Customer-facing workshops, to identify and prioritise which use cases are most important and of highest value, and to create a plan of attack based on assumptions that will later be proved or disproved
- AI Strategy consultation, to identify where companies currently are on their AI/ML journey and whether this matches what they believe, and to resolve any discrepancies respectfully
- Exploratory Data Analysis, on each use case for 1-2 weeks full-time depending on the complexity of the data and of the task. We seek to validate the assumptions made in the earlier stages, and ultimately refine our plan for proof-of-concept. This is also the stage where we and the customer can hit a breakpoint if either does not believe the ROI calculation passes muster
- Proof-of-concept, taking the use case and demonstrating its value in principle, in 6-8 weeks. Usually, this will involve the use of some AI/ML and/or Generative AI, but we would also usually not create a new model/paradigm from scratch, instead using existing machinery and templates available, such as Amazon SageMaker built-in models
After these stages, we would work with, but hand over to, a dedicated 4-8 person team to take the project down the path to production, starting with a Minimum Viable Product (4-8 weeks) and eventually working with the client's team to get it into production and operationalise the solution. You would still be leading this, but in a thought-leadership, ‘engagement lead-like’ role, ensuring continuity of oversight from start to finish of the solution development. You would have the support of Data Engineers, MLOps Engineers, Cloud Architects, Solution Architects, BI Engineers etc., who each specialise in their respective areas and can work together to build the strongest possible solution within the constraints of the project.
- Working with Sales to create delivery models and offerings for a specific customer and specific project. This still comes under Delivery, as that is the main priority: Presales ought to be carried out under the other categories of work above, but also within projects as continuation or upsell, i.e. as part of your Delivery role on the project
- Department: Technology
- Locations: Manila Office
- Remote status: Hybrid Remote
About TQA
TQA helps organizations harness automation, AI, and data to revolutionize industries, achieve extraordinary results, and unlock human potential.
We partner with best-in-class technology providers to deliver unparalleled solutions, services, and experiences that exceed customer expectations.