Research Engineer, Trust & Safety
San Francisco, CA, USA
Posted on Thursday, October 12, 2023
We are looking for research engineers to help build safety and oversight mechanisms for our AI systems. As a trust and safety research engineer, you will train models that detect harmful behaviors and help ensure user well-being. You will apply your technical skills to uphold our principles of safety, transparency, and oversight while enforcing our terms of service and acceptable use policies.
Anthropic is an AI safety and research company that’s working to build reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our customers and for society as a whole. Our interdisciplinary team has experience across ML, physics, policy, business and product.
Responsibilities:
- Build machine learning models to detect unwanted or anomalous behaviors from users and API partners, and integrate them into our production system
- Improve our automated detection and enforcement systems as needed
- Analyze user reports of inappropriate accounts and build machine learning models to detect similar instances proactively
- Surface abuse patterns to our research teams to harden models at the training stage
You may be a good fit if you:
- Have 4+ years of experience in a data scientist, research scientist, or research/ML engineering position, preferably with a focus on trust and safety.
- Have proficiency in SQL, Python, and data analysis/data mining tools.
- Have proficiency in building trust and safety AI/ML systems, such as behavioral classifiers or anomaly detection.
- Have strong communication skills and the ability to explain complex technical concepts to non-technical stakeholders.
- Care about the societal impacts and long-term implications of your work.
Strong candidates may also:
- Have experience with machine learning frameworks like scikit-learn, TensorFlow, or PyTorch
- Have experience with full stack engineering to build internal tooling
- Have experience with high performance, large-scale ML systems
- Have experience with language modeling with transformers
- Have experience with reinforcement learning
- Have experience with large-scale ETL
Anthropic has a collaborative engineering culture and provides competitive compensation. The expected salary range for this position is $300k - $450k USD.
Hybrid policy & visa sponsorship: Currently, we expect all staff to be in our office at least 25% of the time. We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate; operations roles are especially difficult to support. But if we make you an offer, we will make every effort to get you into the United States or United Kingdom, and we retain an immigration lawyer to help with this.