In passively studying Korean, I often come across sentences I don’t understand. That’s not surprising: my Korean is at a decent beginner level, and conversational Korean can build some complicated nested sentences. I often use the Papago translator to help me identify vocabulary, but it’s harder to suss out conjugated verbs & other grammar patterns. So I decided to build something that could help.

This project uses AI (OpenAI’s GPT-4.1 nano) to take a Korean input sentence and generate translations and relevant grammar constructions. I wanted to get more familiar with implementing an AI project & Next.js, and to build something with a clear end goal. I also didn’t want to build a chatbot, so this idea — one sentence in, one explanation out — was perfect.
Feasibility
Before I started anything, I needed to see how well the AIs would handle Korean to English. Google Translate is notoriously bad for Korean, so I wondered if AIs would have the same issue (they’re not all the same, but use similar methods, and are generally trained more on English data).
I asked Claude to explain some sentences, and it did surprisingly well. I went back and forth a few times, then asked it to give me the response in JSON. I had no idea if that would work, but it actually did. I gave the same prompt to ChatGPT, but liked Claude’s JSON format better. Both gave “good enough” answers.1
Here’s an example prompt and response (don’t mind the example sentence; I used 10000recipe.com for a lot of my initial strings 😅).
Can you break down the Korean grammar and vocab in the following sentence? Return the reply as JSON.
대파를 먹어야 하는 10가지 이유
Claude’s response
{
  "sentence": "대파를 먹어야 하는 10가지 이유",
  "translation": "10 reasons why you should eat green onions",
  "breakdown": {
    "words": [
      {
        "korean": "대파",
        "romanization": "dae-pa",
        "part_of_speech": "noun",
        "meaning": "green onion, scallion, leek",
        "notes": "A type of Korean green onion that's thicker than regular scallions"
      },
      {
        "korean": "를",
        "romanization": "reul",
        "part_of_speech": "particle",
        "meaning": "object marking particle",
        "notes": "Marks 대파 as the direct object; used after words ending in consonants (를 after vowels would be 을)"
      },
      {
        "korean": "먹어야",
        "romanization": "meog-eo-ya",
        "part_of_speech": "verb + auxiliary",
        "meaning": "should eat, must eat",
        "notes": "먹다 (to eat) + 어야 하다 construction expressing obligation or necessity"
      },
      {
        "korean": "하는",
        "romanization": "ha-neun",
        "part_of_speech": "verb + modifier ending",
        "meaning": "doing (present participle)",
        "notes": "하다 (to do) + 는 (present tense modifier ending for verbs ending in vowels)"
      },
      {
        "korean": "10가지",
        "romanization": "yeol-ga-ji / sip-ga-ji",
        "part_of_speech": "number + counter",
        "meaning": "10 kinds/types",
        "notes": "10 (십/열) + 가지 (counter for kinds, types, varieties)"
      },
      {
        "korean": "이유",
        "romanization": "i-yu",
        "part_of_speech": "noun",
        "meaning": "reason, cause",
        "notes": "Sino-Korean word meaning reason or rationale"
      }
    ],
    "grammar_points": [
      {
        "structure": "을/를",
        "explanation": "Object marking particle - marks the direct object of the sentence",
        "example": "대파를 (green onions + object marker)"
      },
      {
        "structure": "verb stem + 어야/아야 하다",
        "explanation": "Expression of obligation, necessity, or strong recommendation ('should', 'must', 'have to')",
        "example": "먹어야 하다 (should eat, must eat)"
      },
      {
        "structure": "verb stem + 는",
        "explanation": "Present tense modifier ending used to modify nouns (creates relative clause)",
        "example": "하는 이유 (reasons that [one] does/has)"
      },
      {
        "structure": "number + 가지",
        "explanation": "Counter word for kinds, types, or varieties",
        "example": "10가지 (10 kinds/types)"
      }
    ],
    "sentence_structure": {
      "pattern": "[Object]을/를 [Verb]어야 하는 [Number]가지 [Noun]",
      "explanation": "This creates a noun phrase meaning 'X number of reasons why one should [verb] [object]'",
      "components": [
        "대파를 (object: green onions)",
        "먹어야 하는 (relative clause: that one should eat)",
        "10가지 (quantifier: 10 kinds)",
        "이유 (head noun: reasons)"
      ]
    }
  }
}
That gave me a good starting point for an API response syntax, though as I worked on the app, I realized it was a little overkill. The only things I was going to display were the word & meaning, so I could omit part of speech, notes, and romanization. For the grammar constructs, I wanted all three fields, so I left those in. My ideal API response looked like:
{
  sentence: string,
  translation: string,
  breakdown: {
    words: [{
      korean: string,
      meaning: string,
    }],
    grammar: [{
      structure: string,
      explanation: string,
      example: string (optional),
    }],
  }
}

Simultaneously, I mocked up what the UI would look like in Figma. I thought I might try mocking up the whole thing before building, but I didn’t have enough clarity about what the AI results would be. Instead, I stopped about here, but the mockup did establish some requirements:
- The sentence should be highlighted with English translations on hover
- The sidebar should have a list of past sentences and saved responses
- This is not a chatbot, so you enter a string and get a result, no further interaction on the result page
Implementation
At this point I had a good scope of work, and needed to start actually building. Given this is an AI project, I decided to ask Copilot how to get started. It initially sent me to HuggingFace and wanted me to set up an AI server. I briefly considered it, thinking I could set up a Korean-trained model… but I didn’t want to go down that rabbit hole.
I came across the AI SDK, a library for interacting with different AI providers. It works with React & Next.js, and sounded perfect. I had already implemented the UI with some mocked-up API responses, so now I just had to make the real request from the AI client. The docs for the AI SDK focused on creating chat interfaces, but I was able to extrapolate a few things and (with Copilot’s help referencing docs) found generateObject. This let me define structured data for the AI response with a schema (created with Zod, a schema validation library).
The call sends a prompt with the input sentence, asking for data in the provided object format. I had to tweak the prompt somewhat as I went: at one point it started returning the grammar explanations in Korean, and sometimes the response was missing a field, making it invalid. Adding specific details to the prompt seemed to help.
import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const { object } = await generateObject({
  model: openai("gpt-4.1-nano"),
  schema: z.object({
    sentence: z.string(),
    translation: z.string(),
    breakdown: z.object({
      words: z.array(
        z.object({
          korean: z.string(),
          meaning: z.string(),
        }),
      ),
      grammar: z.array(
        z.object({
          structure: z.string(),
          explanation: z.string(),
          example: z.string().optional(),
        }),
      ),
    }),
  }),
  prompt: `Can you break down the Korean grammar and vocabulary in the following sentence? Identify and translate all of the words into English, include the exact phrase in the sentence. Identify some key grammatical constructions to know, and explain them in English.\n${input}`,
});
I added a very simple cache (a Map object) to prevent repeated OpenAI calls. I also had to handle a loading state and errors, which I did, though not in much detail since this is just a personal project.
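The Map-based cache can be sketched roughly like this (the `Analysis` type and the `analyzeCached` wrapper are names of my own choosing, not necessarily what the repo uses):

```typescript
// Shape of the structured response, mirroring the Zod schema above.
type Analysis = {
  sentence: string;
  translation: string;
  breakdown: {
    words: { korean: string; meaning: string }[];
    grammar: { structure: string; explanation: string; example?: string }[];
  };
};

// In-memory cache keyed by the raw input sentence.
const cache = new Map<string, Analysis>();

async function analyzeCached(
  input: string,
  analyze: (s: string) => Promise<Analysis>,
): Promise<Analysis> {
  const hit = cache.get(input);
  if (hit) return hit; // skip the OpenAI call on repeat inputs
  const result = await analyze(input);
  cache.set(input, result);
  return result;
}
```

The analyzer is passed in as a function here just to keep the sketch self-contained; in the app it would be the generateObject call.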
Checking responses
Once I had the API working, I started doing some edge-case testing: what if you enter phrases that aren’t Korean? In some cases it would return a random basic sentence and explanation; in others it would reverse-translate, and describe the grammar it used in the result. This was interesting, but the app wasn’t designed for it, and it would end up showing English translations over English words.

I added a function to check that the input was Korean before sending it off to the AI API. It checks which characters are in the input string: as long as there is at least one Hangul character and the string is 75% valid characters (Hangul plus numbers and punctuation), it passes through. My goal was to avoid hitting the AI API as much as possible, so this check has a low tolerance for mixed-language input. It’s strict, but it works for my use case.
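A sketch of that check might look like the following. The exact character ranges, allowed punctuation, and function name are my assumptions; only the "one Hangul character + 75% valid" rule comes from the description above.

```typescript
// Precomposed Hangul syllables (가–힣).
const HANGUL = /[\uAC00-\uD7A3]/;
// "Valid" characters: Hangul, digits, whitespace, and common punctuation.
const ALLOWED = /[\uAC00-\uD7A3\d\s.,!?…'"()~%-]/;

function looksKorean(input: string): boolean {
  const chars = [...input];
  if (chars.length === 0) return false;
  // Require at least one Hangul character...
  if (!chars.some((c) => HANGUL.test(c))) return false;
  // ...and require 75% of the string to be valid characters.
  const valid = chars.filter((c) => ALLOWED.test(c)).length;
  return valid / chars.length >= 0.75;
}
```

With these thresholds, a fully Korean sentence passes, while English or mostly-English input is rejected before any API call is made.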
The result
I’ve published the code to GitHub, and you can try it by downloading or cloning the repo and following the two-step process there. I haven’t deployed this anywhere public yet because I’d want to limit my API key’s usage.2 But I can share plenty of screenshots:



Future Improvements
Of course, while I set this up with a specific scope, there’s still plenty that can be done, big and small.
- Hide the full English translation so that I can try to puzzle it out with the words I do know.
- Improve the handling of errors — sentences without punctuation are technically “different,” and so fail the check for whether we’re reviewing the same input sentence.
- Test out a few different AI models — I picked OpenAI because I had an account and could get an API key, and 4.1 nano for speed and cost. But I would like to see if other models pick out different grammar patterns, or explain better.
- Implement that reverse translation properly so that I could pass in a complicated English sentence and see how it works in Korean.
- Add the little arrows back onto my Tooltip component & display them above the words 😅
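For the error-handling item above, one possible fix is normalizing sentences before comparing or caching them, so that "same sentence, different punctuation" maps to one entry. This is a sketch with names of my own choosing, not the repo’s code:

```typescript
// Reduce a sentence to a canonical form for use as a cache/comparison key.
function cacheKey(input: string): string {
  return input
    .normalize("NFC")               // unify composed/decomposed Hangul
    .replace(/[.,!?…~'"()]/g, "")   // drop common punctuation
    .replace(/\s+/g, " ")           // collapse runs of whitespace
    .trim();
}
```

With this, `밥 먹었어?` and `밥 먹었어` would be treated as the same input sentence.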
Overall, I’m happy with what I was able to build over the course of a few days. This project hit all my original goals: I got experience with AI integration, became more comfortable with Next.js, and built something with a clear, practical purpose. Next.js was a great choice — the scaffolding made it easy to get started quickly, and the app router made creating both pages and API endpoints easy. The AI SDK also impressed me; I stuck with simple requests, but the streaming responses and chat helpers look really powerful.
If you’re interested in the implementation details or want to try it yourself, check out the GitHub repo!
- As I was more focused on building the tool, I didn’t spend that much time diving into which models would give the best results. ↩︎
- Honestly, if you’re looking for an app that does this & not just a tech experiment, you’re better off with mirinae.io. ↩︎