Building AI-Powered Features with Google Gemini API

My experience integrating Gemini 2.5 Pro into MatchMyResume and MedScan AI — prompt engineering, streaming responses, and handling multimodal inputs.

Suyog Bhise
Full Stack Developer · Pune, India

Why Gemini Over OpenAI

For both MatchMyResume and MedScan AI, I chose Google's Gemini API over OpenAI's GPT. The main reasons: Gemini's multimodal capabilities (especially for PDF/image analysis in MedScan), better free-tier limits for prototyping, and Google AI Studio's playground, which made iteration fast.

MatchMyResume: Structured Output

The core feature was comparing a resume against a job description and returning a structured analysis. Getting consistent JSON output from an LLM requires careful prompt engineering:

const prompt = `
You are a professional resume reviewer. Analyze the resume against the job description.

Return ONLY valid JSON in this exact structure, no markdown, no explanation:
{
  "matchScore": number (0-100),
  "matchedSkills": string[],
  "missingSkills": string[],
  "suggestions": string[],
  "summary": string
}

Resume:
${resumeText}

Job Description:
${jobDescription}
`;

const result = await model.generateContent(prompt);
const text = result.response.text();
// Strip any markdown fences the model adds despite the instructions
const clean = text.replace(/```json|```/g, '').trim();
const parsed = JSON.parse(clean); // still throws if the output isn't valid JSON
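Even with a strict prompt, the model occasionally wraps the JSON in fences or adds a sentence of commentary, so parsing defensively pays off. A minimal sketch of the approach (the helper name `parseModelJson` is my own, not part of the SDK):

```javascript
// Hypothetical helper: extract and validate the JSON object from a model
// response, tolerating stray markdown fences or surrounding prose.
function parseModelJson(text) {
  // Grab the first {...} span in case the model added explanation anyway
  const match = text.match(/\{[\s\S]*\}/);
  if (!match) throw new Error('No JSON object found in model response');
  const parsed = JSON.parse(match[0]);
  // Check the fields the UI depends on before trusting the result
  for (const key of ['matchScore', 'matchedSkills', 'missingSkills']) {
    if (!(key in parsed)) throw new Error(`Missing field: ${key}`);
  }
  return parsed;
}
```

If validation fails, I retry the request rather than show the user a broken analysis.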

MedScan AI: Multimodal Input

MedScan needed to read medical lab reports — often scanned PDFs or photos. Gemini's vision capabilities handled this natively:

const imagePart = {
  inlineData: {
    data: base64Image,
    mimeType: 'image/jpeg',
  },
};

const result = await model.generateContent([
  imagePart,
  'Analyze this medical lab report. Explain each value in plain English. Flag anything outside normal range.',
]);
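Building that `inlineData` part from an uploaded file is just a base64 conversion. A small sketch in Node (the helper name `fileToInlinePart` is my own):

```javascript
// Hypothetical helper: turn a file's bytes into the inlineData part
// shape shown above. Buffer is a Node global, so no imports needed.
function fileToInlinePart(buffer, mimeType) {
  return {
    inlineData: {
      data: buffer.toString('base64'),
      mimeType,
    },
  };
}
```

Usage with a local file: `const imagePart = fileToInlinePart(fs.readFileSync('report.jpg'), 'image/jpeg');` — in an Express route, the buffer typically comes straight from the upload middleware instead.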

Rate Limits & Error Handling

The free tier has rate limits. For production, always implement exponential backoff and show meaningful loading states to users — AI responses can take 3–8 seconds.
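A backoff wrapper along these lines is what I mean (names like `withBackoff` and the options are my own; adapt the error check to the error shapes your SDK version actually throws):

```javascript
// Hypothetical helper: retry a Gemini call with exponential backoff,
// giving up after `retries` attempts on rate-limit style errors.
async function withBackoff(fn, { retries = 4, baseMs = 500 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Only retry errors that look like rate limiting (HTTP 429)
      const retriable = err.status === 429 || /rate/i.test(err.message || '');
      if (!retriable || attempt >= retries) throw err;
      const delay = baseMs * 2 ** attempt; // 500ms, 1s, 2s, 4s, ...
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

Usage: `const result = await withBackoff(() => model.generateContent(prompt));`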

Key Lessons

Demand strict JSON in the prompt, but still sanitize and parse the response defensively — models sometimes wrap output in markdown fences anyway. Gemini's native multimodal input meant MedScan never needed a separate OCR step. And treat rate limits as a given from day one: retry with backoff and keep users informed while they wait.
