LLM Ops & Evaluation

Location
Vilnius, Lithuania

Quick summary: We're looking for someone to own the quality of our AI employees - prompts, evaluations, datasets, and model selection. You'll build the systems that determine whether our AI helpers actually help. This is a foundational role: you'll design our evaluation framework, be our primary technical contact with LLM providers, and be responsible for improving LLM outputs across the app.

This role is for someone who's shipped LLM-powered products at scale, thinks in systems, and wants to define how AI quality works at a company scaling fast.

Why join Sintra?

We build AI employees for small businesses. Real helpers with personalities, not faceless chatbots. They handle the work that keeps owners up at night - answering customer emails, posting on social media, analyzing sales data. For business owners who've always worked alone, we're giving them their first team.

50,000+ businesses use Sintra because for the first time someone made AI actually useful for them. While Silicon Valley builds for tech companies, we're building for the florist who needs help with Instagram, the contractor drowning in invoices, the restaurant owner who can't keep up with reviews.

The timing matters. LLMs just got good enough to actually do the work - not just talk about it. We're at the moment where this becomes real infrastructure for millions of businesses.

We raised $17M in seed funding. Team of 50, based in Vilnius, shipping daily. We move fast, take ownership of what we build, and live by one principle - work is play.

Who we're looking for
  • 2+ years hands-on with LLMs in production
  • Has built evaluation systems, not just written prompts
  • Strong technical skill set - you'll automate evals and work closely with engineering
  • Systems thinker who can handle hundreds of use cases, each multiplied by per-user customization
  • Clear communicator - you'll be our point person with the technical teams at leading AI labs

What you'll do
  • Own end-to-end quality of all AI outputs across the app
  • Design and build our evaluation framework - automated tests, human review loops, quality scoring
  • Create, version, and optimize prompts for every use case
  • Build and maintain test datasets that catch regressions before users do
  • Hire and lead a team of prompt engineers and eval specialists as we scale

Our hiring process
  1. Fill in the application form. If we see a fit, we'll reach out for an intro call.
  2. Complete a take-home task that mirrors real work you'd do here. Be prepared to explain what you did and why.
  3. Join us for a tech call. Meet the team, see if we're right for each other.
  4. Get an offer if it's a mutual fit.

We understand that good people have options. That's why we move fast. Life's too short for drawn-out hiring processes.

What we offer
  • Compensation & Equity
    Top-of-market salary in Vilnius plus meaningful equity, so you own a part of what you build. Salary range for this role: €5,000-€8,000/month, depending on expertise and experience.
  • Seamless Relocation
    Relocation bonus and support to make your move to Vilnius smooth.
