Creating an Automatic Video Subtitle Translator (English ➡️ Korean)


Recently, I've been watching a few YouTube videos by Andrej Karpathy.

Andrej Karpathy is one of the world's most renowned researchers in the fields of artificial intelligence and deep learning. I've heard his name many times before, but recently, he's popped up in my YouTube recommendations, so I've watched a few of his videos.

There's one reason I ended up watching Andrej Karpathy's videos and decided to keep following his content: he explains things really well. It's not just that he's a fluent speaker; because he has accomplished so much in the field and is still active as a leading researcher, he can explain things in a friendly, easy-to-understand way even for people outside the field.

The video I'm watching this time is "Deep Dive into LLMs like ChatGPT," released on February 6, 2025. As of March 10, 2025, it's a very hot video with over 1.6 million views, and it has been mentioned by many people on Facebook and on various tech blogs.

This video has a high view count and is often discussed in other LLM-related communities, so I've been thinking about watching it for a while. But at the same time, I couldn't ignore the language barrier.
It's not that I can't watch and understand English videos in real time, but my comprehension is much lower than in my native language, especially with technical terms and expressions, or with videos about fields I'm not familiar with.

So, I decided to just create a video subtitle translator. I started with a vague idea: download the YouTube video, create subtitles, translate the subtitles, and then add the translated subtitles back to the video.

This article covers the process of recognizing the speech in a video, creating subtitles in a different language, and adding them back to the video. Some code is included, but reading it closely isn't important; the focus is on how the final program came together using Cursor (an LLM-based editor).


LLM chat - Getting a feel for it

I started by using Cursor's Chat feature. The prompt was:

Let's write TypeScript code that satisfies the following requirements.

  • Read a video, translate the language spoken by the speaker in the video (English) into Korean, and create a new video with Korean subtitles.

First, let's discuss the components of the code and how it should be structured to meet the above requirements.

The LLM responded as follows:

# Required Main Components

1.  Video Processing
  - Library to read and process input video (FFmpeg)
  - In TypeScript, libraries like fluent-ffmpeg can be used.

2.  Speech Recognition (Speech-to-Text)
  - Extract audio from video and convert it to text.
  - Options: Google Cloud Speech-to-Text, AWS Transcribe, OpenAI Whisper API, etc.

3.  Translation API
  - Translate English text to Korean.
  - Options: Google Translate API, DeepL API, Papago API, etc.

4.  Subtitle Generation and Video Composition
  - Generate subtitle files (SRT/VTT) with timestamps from translated text.
  - Composite subtitles onto the video to create the final output.

# Technology Stack and Libraries to Use

1.  Basic Environment:
  - Node.js + TypeScript
  - npm or yarn package manager

2.  Video Processing:
  - fluent-ffmpeg: A library that allows you to use FFmpeg in Node.js.
  - @ffmpeg-installer/ffmpeg: Automatically installs FFmpeg binaries.

3.  Speech Recognition and Translation:
  - OpenAI Whisper API: High-quality speech recognition.
  - Google Cloud Translate or DeepL API: Translation.

4.  Subtitle Generation and Processing:
  - subtitle or subtitles-parser: Create and process SRT/VTT files.
  - node-srt: Create SRT files.

5.  Utilities:
  - dotenv: Environment variable management.
  - axios: HTTP requests.

# Proposed Project Structure

project/
├── src/
│ ├── index.ts # Main application entry point
│ ├── videoProcessor.ts # Video processing related functions
│ ├── speechRecognizer.ts # Speech recognition related functions
│ ├── translator.ts # Translation related functions
│ ├── subtitleGenerator.ts # Subtitle generation related functions
│ └── types/ # Type definitions
├── package.json
├── tsconfig.json
└── .env # Environment variables such as API keys

# Processing Flow

1.  Extract audio from the input video.
2.  Perform speech recognition (STT) using the audio, obtaining English text and timestamps.
3.  Translate the recognized English text into Korean.
4.  Generate a subtitle file (SRT) using the translated Korean text and timestamps.
5.  Combine the original video and subtitle file to create the final video.

What do you think of this structure? Are there any other requirements or considerations before we start writing code?

It's probably time to stop being surprised when I get this quality of answer, but I still can't help but be amazed.


Warm-up Feature Development

Before starting full-fledged feature development, I wanted to create a very, very simple feature first.

That's because I couldn't 100% trust the LLM's output above. Sometimes it claims a non-existent library exists, and sometimes it uses a library that exists but for a purpose it wasn't intended for.

So, I entered the following prompt for now:

Let's start with a very simple code. Cut out the beginning of the video (about 10 seconds) and add the Korean subtitle '안녕하세요' (Hello) to the video.

Building on the first conversation and the prompt above 👆, the LLM responded as follows:

1. Project Initialization

First, let's create a project directory and initialize it.

mkdir -p simple-video-subtitle && cd simple-video-subtitle && npm init -y

2. Install Required Packages

Let's install the necessary packages. We'll set up FFmpeg and the TypeScript environment for video processing.

npm install fluent-ffmpeg @ffmpeg-installer/ffmpeg typescript ts-node @types/node @types/fluent-ffmpeg

3. Create TypeScript Configuration File & Project Structure

npx tsc --init
mkdir -p src

4. Modify package.json Scripts

Let's update the package.json scripts:

  "npm": "10.9.2"
  },
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1",
    "start": "ts-node src/index.ts"
  },
  "keywords": [],
  "author": "",

5. Write Source Code

First, let's create the main file and write the code to cut out the first 10 seconds of the video and add the subtitle '안녕하세요':

index.ts
import ffmpeg from 'fluent-ffmpeg';
import { path as ffmpegPath } from '@ffmpeg-installer/ffmpeg';
import * as fs from 'fs';
import * as path from 'path';

// Set FFmpeg path
ffmpeg.setFfmpegPath(ffmpegPath);

// Function to create directories
function ensureDirectoryExists(directory: string): void {
  if (!fs.existsSync(directory)) {
    fs.mkdirSync(directory, { recursive: true });
  }
}

// Function to create SRT subtitle file
function createSimpleSubtitle(outputPath: string): string {
  const srtContent = `1
00:00:00,000 --> 00:00:10,000
안녕하세요`;

  const srtPath = path.join(outputPath, 'subtitle.srt');
  fs.writeFileSync(srtPath, srtContent);
  return srtPath;
}

// Main function
async function main() {
  // Set input and output paths
  const inputDir = path.join(__dirname, "..", 'input');
  const outputDir = path.join(__dirname, "..", 'output');

  // Check directories
  ensureDirectoryExists(inputDir);
  ensureDirectoryExists(outputDir);

  // Input video file path (user must put the video file in this path)
  const inputVideoPath = path.join(inputDir, 'input.mp4');

  // Output video file path
  const outputVideoPath = path.join(outputDir, "output_with_subtitle.mp4");

  // Check if input file exists
  if (!fs.existsSync(inputVideoPath)) {
    console.error(`Input video file not found: ${inputVideoPath}`);
    console.log("Please add the input.mp4 file to the input folder.");
    return;
  }

  // Create subtitle file
  const subtitlePath = createSimpleSubtitle(outputDir);

  console.log("Video processing started...");

  return new Promise<void>((resolve, reject) => {
    ffmpeg(inputVideoPath)
      .setDuration(10) // Cut only the first 10 seconds
      .videoFilters(`subtitles=${subtitlePath}`) // Specify filter as a string
      .output(outputVideoPath)
      .on("end", () => {
        console.log(`Processing complete! Output file: ${outputVideoPath}`);
        resolve();
      })
      .on("error", (err) => {
        console.error("Error occurred:", err);
        reject(err);
      })
      .run();
  });
}

// Run the program
main().catch((err) => {
  console.error("Error occurred while running the program:", err);
});

There was one error partway through the generated source code, but the LLM recognized it and corrected it on its own.

Based on the final source code, I tested it with a 1-minute video as input, and it worked very well. I could confirm two things here:

  1. The program plan established based on the requirements is quite reasonable, and
  2. There are no problems with the code generated so far or the libraries it references.

I was particularly encouraged by the fact that it ran cleanly without any errors. Now it's time to start building the automatic video subtitle translator in earnest.


Beginning Full-fledged Feature Implementation

I wanted the index.ts file to serve only as the entry point of the program, and the main logic to be in a file called translate_video.ts. I started with the following prompt:

Very good!

Now, let's write the program we originally intended. Create a file called translate_video.ts under the src directory and write the code in this file.

1. Add Dependencies

Yes, let's implement the functionality to recognize the speech in the video, translate it, and add subtitles. To do this, we need the following additional packages. First, let's install the necessary packages:

npm install @google-cloud/speech openai @google-cloud/translate dotenv

2. Write translate_video.ts Code

Now, let's create the translate_video.ts file and write the code:

import ffmpeg from "fluent-ffmpeg";
import { path as ffmpegPath } from "@ffmpeg-installer/ffmpeg";
import * as fs from "fs";
import * as path from "path";
import * as speech from "@google-cloud/speech";
import { OpenAI } from "openai";
import { v2 as translate } from "@google-cloud/translate";
import * as dotenv from "dotenv";

// Load environment variables
dotenv.config();

// Set FFmpeg path
ffmpeg.setFfmpegPath(ffmpegPath);

// Initialize API clients
const speechClient = new speech.SpeechClient();
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const translateClient = new translate.Translate({
  key: process.env.GOOGLE_TRANSLATE_API_KEY,
});

interface SubtitleSegment {
  start: number;
  end: number;
  text: string;
}

// Function to create directories
function ensureDirectoryExists(directory: string): void {
  if (!fs.existsSync(directory)) {
    fs.mkdirSync(directory, { recursive: true });
  }
}

// Function to extract audio
async function extractAudio(
  inputPath: string,
  outputPath: string
): Promise<void> {
  return new Promise((resolve, reject) => {
    ffmpeg(inputPath)
      .toFormat("wav")
      .outputOptions("-acodec pcm_s16le")
      .outputOptions("-ac 1")
      .outputOptions("-ar 16000")
      .save(outputPath)
      .on("end", () => resolve())
      .on("error", reject);
  });
}

// Function for speech recognition
async function transcribeAudio(audioPath: string): Promise<SubtitleSegment[]> {
  const audioBytes = fs.readFileSync(audioPath).toString("base64");

  const audio = {
    content: audioBytes,
  };
  const config = {
    encoding: "LINEAR16",
    sampleRateHertz: 16000,
    languageCode: "en-US",
    enableWordTimeOffsets: true,
  };
  const request = {
    audio: audio,
    config: config,
  };

  try {
    const [response] = await speechClient.recognize(request);
    const segments: SubtitleSegment[] = [];

    response.results?.forEach((result) => {
      result.alternatives?.[0].words?.forEach((wordInfo) => {
        const start = Number(wordInfo.startTime?.seconds || 0);
        const end = Number(wordInfo.endTime?.seconds || 0);
        const text = wordInfo.word || "";

        if (
          segments.length > 0 &&
          segments[segments.length - 1].end === start
        ) {
          // Connect to the previous segment
          segments[segments.length - 1].text += " " + text;
          segments[segments.length - 1].end = end;
        } else {
          // Create a new segment
          segments.push({ start, end, text });
        }
      });
    });

    return segments;
  } catch (error) {
    console.error("Speech recognition error:", error);
    throw error;
  }
}

// Text translation function
async function translateText(text: string): Promise<string> {
  try {
    const [translation] = await translateClient.translate(text, {
      from: "en",
      to: "ko",
    });
    return translation;
  } catch (error) {
    console.error("Translation error:", error);
    throw error;
  }
}

// SRT file creation function
function createSRTFile(segments: SubtitleSegment[], outputPath: string): void {
  let srtContent = "";
  segments.forEach((segment, index) => {
    const startTime = formatSRTTime(segment.start);
    const endTime = formatSRTTime(segment.end);

    srtContent += `${index + 1}\n`;
    srtContent += `${startTime} --> ${endTime}\n`;
    srtContent += `${segment.text}\n\n`;
  });

  fs.writeFileSync(outputPath, srtContent);
}

// SRT time format function
function formatSRTTime(seconds: number): string {
  const pad = (num: number): string => num.toString().padStart(2, "0");
  const hours = Math.floor(seconds / 3600);
  const minutes = Math.floor((seconds % 3600) / 60);
  const secs = Math.floor(seconds % 60);
  const ms = Math.floor((seconds % 1) * 1000);

  return `${pad(hours)}:${pad(minutes)}:${pad(secs)},${ms
    .toString()
    .padStart(3, "0")}`;
}

// Main processing function
export async function translateVideo(
  inputVideoPath: string,
  outputVideoPath: string
): Promise<void> {
  try {
    // Set working directory
    const workDir = path.join(path.dirname(inputVideoPath), "temp");
    ensureDirectoryExists(workDir);

    // Extract audio
    console.log("Extracting audio...");
    const audioPath = path.join(workDir, "audio.wav");
    await extractAudio(inputVideoPath, audioPath);

    // Speech recognition
    console.log("Speech recognition...");
    const segments = await transcribeAudio(audioPath);

    // Translation
    console.log("Translating text...");
    const translatedSegments: SubtitleSegment[] = [];
    for (const segment of segments) {
      const translatedText = await translateText(segment.text);
      translatedSegments.push({
        ...segment,
        text: translatedText,
      });
    }

    // Create subtitle file
    console.log("Creating subtitle file...");
    const subtitlePath = path.join(workDir, "subtitle.srt");
    createSRTFile(translatedSegments, subtitlePath);

    // Subtitle composition
    console.log("Composing subtitles...");
    return new Promise((resolve, reject) => {
      ffmpeg(inputVideoPath)
        .videoFilters(`subtitles=${subtitlePath}`)
        .output(outputVideoPath)
        .on("end", () => {
          // Clean up temporary files
          fs.rmSync(workDir, { recursive: true, force: true });
          console.log("Processing complete!");
          resolve();
        })
        .on("error", (err: Error) => {
          console.error("Error occurred:", err);
          reject(err);
        })
        .run();
    });
  } catch (error) {
    console.error("Error occurred during video processing:", error);
    throw error;
  }
}

Finally, let's create a .env file for environment variable settings:

# Google Cloud API key settings
GOOGLE_APPLICATION_CREDENTIALS="path/to/your/google-credentials.json"
GOOGLE_TRANSLATE_API_KEY="your-google-translate-api-key"

# OpenAI API key settings
OPENAI_API_KEY="your-openai-api-key"

3. Code Explanation

Now the structure and functionality of the project are complete! The main functions are as follows:

1.  Audio Extraction: Extracts audio from the video using FFmpeg.
2.  Speech Recognition: Converts English speech to text using the Google Cloud Speech-to-Text API.
3.  Translation: Translates English text to Korean using the Google Cloud Translate API.
4.  Subtitle Generation: Creates an SRT-formatted subtitle file from the translated text.
5.  Subtitle Composition: Combines the subtitles with the original video using FFmpeg.

To use it, you need to follow these steps:

1.  Set the required API keys in the .env file:
  - Google Cloud service account key file path
  - Google Translate API key
  - OpenAI API key (for future expansion)

2.  Put the video file to be processed in the input folder with the name input.mp4.
3.  Run it with the following command:
    `npm start`

Running the Code

It took a few more conversations to get to a version of the source code with all the compile errors fixed, but the code was ready very quickly anyway.

So I ran the code right away. And then I found three problems.

  1. Authentication Issues

    • This program uses the Google Cloud Speech-to-Text API to separate text from speech and the Google Cloud Translate API for text translation.
    • The version initially written by the LLM required a lot of complicated key settings to authenticate with the Google Cloud API.
  2. API Usage Issue (1)

    • The code the LLM wrote called the Speech-to-Text API in a way that only supports audio shorter than one minute.
    • So when I gave it a video tens of minutes long as input, an error was logged and the program died partway through.
  3. API Usage Issue (2)

    • When calling the Speech-to-Text API, a large audio file cannot be sent inline; it has to be delivered via GCS (Google Cloud Storage).
    • This issue also caused the program to stop in the middle of execution.

Solving Authentication Issues

I went into the Google Cloud console and fiddled with the project settings to issue an authentication key. I had used GCP for a company project before, so I expected this part to be easy to solve.

But I was reluctant to issue and manage separate keys just for authentication. I also wanted to log in via the CLI and have those credentials picked up by the code automatically, rather than wiring key settings into the code. Since this was something I already knew how to do, I modified the code myself.

For more information on GCP API authentication, see this document.
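For reference, the shape of my change was roughly the following, a minimal sketch assuming credentials set up once by running gcloud auth application-default login (the client names follow the code above; the actual diff differed in details):

// With Application Default Credentials, no key file or API key needs to be
// wired into the code; the client libraries discover credentials from the
// environment on their own.
import * as speech from "@google-cloud/speech";
import { v2 as translate } from "@google-cloud/translate";

const speechClient = new speech.SpeechClient();
const translateClient = new translate.Translate();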

Solving API Usage Issues

Since the error logs from the Speech-to-Text API calls were printed in the console, I selected them and added them to the Chat interface.

The LLM immediately recognized the cause of the error and automatically corrected the code. It sequentially solved the two API usage issues I identified earlier.

In particular, it recognized that GCS was needed (something I hadn't realized at first), automatically installed the GCS client library, used it to upload the audio file to Storage, and set the uploaded file's URI as the input to the Speech-to-Text API.
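A simplified sketch of what that fix amounts to, as I understand it (my approximation rather than the exact code Cursor generated; the bucket name is a placeholder and must exist beforehand):

import { Storage } from "@google-cloud/storage";
import * as speech from "@google-cloud/speech";

const storage = new Storage();
const speechClient = new speech.SpeechClient();

async function transcribeLongAudio(localAudioPath: string, bucketName: string) {
  // 1. Upload the extracted WAV file to a GCS bucket.
  const [file] = await storage.bucket(bucketName).upload(localAudioPath);
  const gcsUri = `gs://${bucketName}/${file.name}`;

  // 2. longRunningRecognize accepts a gs:// URI and handles audio longer than
  //    one minute, unlike the synchronous recognize() call used before.
  const [operation] = await speechClient.longRunningRecognize({
    audio: { uri: gcsUri },
    config: {
      encoding: "LINEAR16",
      sampleRateHertz: 16000,
      languageCode: "en-US",
      enableWordTimeOffsets: true,
    },
  });

  // 3. Wait for the operation to finish; the existing word-level parsing of
  //    response.results (building SubtitleSegment entries) stays the same.
  const [response] = await operation.promise();
  return response.results ?? [];
}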


Second Code Execution

Now I feel like the code is getting close to completion.

I put in a short 5-minute video as input and ran the program. The execution log showed every step being performed without a single error: separating the audio from the video, extracting text from the audio, translating the extracted text into Korean, and finally adding the translated Korean subtitles to the video!

With a trembling heart, I ran the output file (video).

(Screenshot: first result)

Two problems immediately came to my attention.

  • The word ChatGPT came out as Chachi PT. This looks like a problem in the speech-to-text step; the speaker definitely pronounced it ChatGPT.
  • The subtitles are displayed on the screen for too long. This is because the text extracted from the speech was converted directly into subtitle data.

I decided to tackle the problem of subtitles staying on screen too long first. I entered the following prompt, attaching the image above to give the LLM more context.

The final output video subtitles are strange.

The subtitles are strange because they are processed as a single subtitle for too long, as shown below. Split the subtitles to a length that people can read comfortably.

The LLM responded as follows:

Yes, I will split the subtitles to make them easier to read.
It is usually recommended to split subtitles into lengths of 1-3 seconds.
I will add logic to split the segments into smaller units when processing the speech recognition results.

It wrote a new function called splitSegments and wired it into the existing logic so that subtitle lengths are trimmed mechanically.
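My rough reading of that function looks like the sketch below (an approximation reusing the SubtitleSegment type from earlier; the 3-second cap is an assumption based on the 1-3 second recommendation above):

// Cut any segment longer than maxDuration into equal slices, distributing the
// text across the slices roughly by word count.
function splitSegments(
  segments: SubtitleSegment[],
  maxDuration = 3 // seconds per subtitle
): SubtitleSegment[] {
  const result: SubtitleSegment[] = [];

  for (const segment of segments) {
    const duration = segment.end - segment.start;
    if (duration <= maxDuration) {
      result.push(segment);
      continue;
    }

    const pieces = Math.ceil(duration / maxDuration);
    const words = segment.text.split(/\s+/);
    const wordsPerPiece = Math.ceil(words.length / pieces);
    const pieceDuration = duration / pieces;

    for (let i = 0; i < pieces; i++) {
      const text = words
        .slice(i * wordsPerPiece, (i + 1) * wordsPerPiece)
        .join(" ");
      if (!text) continue;
      result.push({
        start: segment.start + i * pieceDuration,
        end: segment.start + (i + 1) * pieceDuration,
        text,
      });
    }
  }

  return result;
}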


Third Code Execution

Now for the third run of the program.
It's amazing how quickly a more or less working program comes together.

However, there was some disappointment in the third code run.

(Screenshot: third run, mistranslated subtitle)

Almost at the beginning of the video, the speaker says this sentence:
"it is obviously magical and amazing in some respects, it's a really good at some things, not very good at other things and there's also a lot of sharp edges to be aware of."

In Korean, it could be translated as:
그것은 분명히 마법처럼 놀라운 면이 있으며, 어떤 면에서는 정말 뛰어나지만, 다른 면에서는 그다지 좋지 않고, 조심해야 할 날카로운 부분도 많다.

The problem is that, because the text extracted from the speech is cut mechanically by the second, the transcript between 17 and 20 seconds of the video ends up as 'respects, it's a really good at some things, not very'. That fragment alone was translated from English to Korean, so the final subtitle started with 존경합니다 (respects ➡️ I respect you) 🙃

Now that I've identified exactly what the problem is, it's time to write another prompt. This time, instead of an image, I specified the transcription.json file, which is the result of extracting text from the speech, as input.

@translate_video.ts @transcription.json

This file is the result of speech -> text conversion. If you look closely, you can see that a sentence is interrupted in the middle.

  1. how do I make this video for a while it
  2. is a comprehensive but generation audience intoduction

As a result, the final translation quality is not good. Let's improve the speech -> text conversion logic so that a sentence is not interrupted in the middle.

The LLM responded as follows:

Yes, as you said, the translation quality will be degraded if a sentence is interrupted in the middle.
I will modify the speech recognition results to be reconstructed into sentence units.

It wrote a new function called groupWordsIntoSentences to replace the one that had mechanically cut sentences by the second.

Another amazing thing is that it modified the Speech-to-Text API call to turn on automatic punctuation. The enableAutomaticPunctuation setting is an option in the Google Cloud Speech-to-Text API that determines whether punctuation (periods, commas, question marks, etc.) is added automatically. For example (a combined sketch of both changes follows the example):

  • Voice Input: "Hello how are you I'm fine thank you"
  • Output: "Hello, how are you? I'm fine, thank you."
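Put together, the two changes amount to something like this sketch (my approximation of the generated code, not the exact version; the WordInfo shape is assumed):

// The Speech-to-Text config gains one option so that recognized words carry
// sentence-ending punctuation.
const config = {
  encoding: "LINEAR16" as const,
  sampleRateHertz: 16000,
  languageCode: "en-US",
  enableWordTimeOffsets: true,
  enableAutomaticPunctuation: true, // the option the LLM switched on
};

// Assumed shape for a recognized word with its time offsets (in seconds).
interface WordInfo {
  word: string;
  start: number;
  end: number;
}

// Group consecutive words into one subtitle segment per sentence, closing a
// segment whenever a word ends with '.', '?' or '!'.
function groupWordsIntoSentences(words: WordInfo[]): SubtitleSegment[] {
  const sentences: SubtitleSegment[] = [];
  let current: WordInfo[] = [];

  const flush = () => {
    if (current.length === 0) return;
    sentences.push({
      start: current[0].start,
      end: current[current.length - 1].end,
      text: current.map((w) => w.word).join(" "),
    });
    current = [];
  };

  for (const w of words) {
    current.push(w);
    if (/[.?!]$/.test(w.word)) flush();
  }
  flush(); // any trailing words that never hit a sentence terminator

  return sentences;
}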

Fourth Code Execution

It feels like the sentence handling is getting more and more reasonable.

I hope the new function groupWordsIntoSentences and the newly set API parameters will work well.

Now it's time to run the fourth code.

(Screenshots: fourth run, results 1 and 2)

I watched the first five minutes of the video and had no trouble understanding the content. Some word choices were awkward and some sentences felt translated too literally, but overall it's pretty good.

It took about 30 minutes from entering the first prompt to the fourth code run. With no prior knowledge of video processing, I created a usable automatic video subtitle translator with just a few prompts.


Retrospective

Now that the program development is complete, let's do a retrospective!

Cost

Let's also analyze the cost.

GCP billing breakdown:
  • Google Cloud Speech API: 975 won
  • Google Cloud Storage: 0 won
  • Google Translate: 6,936 won (the first 500,000 characters per month are free, so the full amount was discounted and the actual charge was 0 won)

This amount includes the API calls made during development as well as processing the entire original video, which is over 3 hours long.

However, there seems to be a slight problem in the logic that made the cost higher than necessary; the Google Translate usage in particular looks excessive. This is something to improve later (one idea is sketched below).
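One idea, assuming the excess came mostly from re-translating the same transcript on every development run, is a small on-disk cache in front of the existing translateText function (translateTextCached is a hypothetical helper, not something the current code contains):

// Hypothetical improvement: cache translations on disk so repeated runs during
// development don't send the same English text to the Translate API again.
import * as fs from "fs";

const CACHE_PATH = "translation-cache.json";
const cache: Record<string, string> = fs.existsSync(CACHE_PATH)
  ? JSON.parse(fs.readFileSync(CACHE_PATH, "utf-8"))
  : {};

async function translateTextCached(text: string): Promise<string> {
  if (cache[text] !== undefined) return cache[text];
  const translated = await translateText(text); // the existing function above
  cache[text] = translated;
  fs.writeFileSync(CACHE_PATH, JSON.stringify(cache, null, 2));
  return translated;
}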

Comparison of Translation Quality Results (vs YouTube Automatic Translation)

When watching a video on YouTube, you can translate the video's subtitles in real time. Let's compare the quality of this real-time translation with the translation by the automatic translator we created this time.

(Screenshot: YouTube vs. translator, example 1)
  • The top one is with YouTube's real-time subtitle translation turned on. It translates the word Cloud as 구름 (cloud). On the other hand, the subtitles from the automatic translator below are fine.
(Screenshot: YouTube vs. translator, example 2)
  • Similarly, the YouTube real-time translation on top is a bit awkward. It seems to translate sentences in small chunks because it's a real-time translation. The automatic translator below recognizes and translates sentences to some extent.

Overall, the quality of the automatic translator is better than YouTube's real-time subtitle translation. That said, I didn't watch and compare the entire video; I only sampled a few random sections to compare.

Final Thoughts

  1. The Initiative of Problem Recognition

    • I recognized that the subtitles stayed on screen too long and covered too much of it, and asked the LLM for a fix. Later, I recognized that the final subtitle quality was poor because sentences were being cut up before translation, so I wrote a prompt specifying that this needed to be fixed.
    • In both cases, I was the one who recognized the problem and told the LLM what to solve.
    • This made me wonder: "Could the LLM do this on its own?" That is, could the LLM recognize not only the solution but the problem itself?
    • LLMs are very good at recognizing problems in source code (especially syntax errors). The linter or IDE outputs error logs in real time, which the LLM can read immediately, and the training data presumably included a lot of code and related documentation.
    • Can the LLM follow the train of thought: 'The subtitles in the video are too long ➡️ they need to be trimmed to a reasonable length'? I was curious about this part.
    • The way I used the LLM to write this program was that I recognized the problem and used the LLM only for the solution. Next time, I think I should involve the LLM from the problem recognition stage.
  2. The Importance of Prompts

    • The initial prompt I entered to create this program was quite simple.

      Let's write TypeScript code that satisfies the following requirements.

      • Read a video, translate the language spoken by the speaker in the video (English) into Korean, and create a new video with Korean subtitles.

      First, let's discuss the components of the code and how it should be structured to meet the above requirements.

    • What if this initial prompt had been more comprehensive?

      • That the input video is over 3 hours long.
      • That the length of the subtitles should be neither too short nor too long, so that they are comfortable for people to read.
    • However, these were things I didn't think about when I first started creating this program.

      • I was thinking too vaguely. "I want to watch a video in English with Korean subtitles."
      • If I had thought a little more about what I wanted and made it more specific, and then started writing the prompt, I think I could have finished it much faster. Maybe I could have completed the program in one shot 🧙‍♂️
  3. How Much Should We Read the Code Generated by the LLM?

    • This is also something to think about: along the way to a finished program, how deeply should we read the large amount of code the LLM writes?
    • I tried to read the code to some extent while building the program, so it took a little longer. I felt I needed to know at least how the code was structured and which libraries it used, so that I could fix it if something went wrong.
    • It's a person's choice whether to read it deeply line by line or not to read it at all and leave everything to the LLM. And this will vary depending on the nature of the program.
    • I think we should at least read the code enough to understand the principle of operation.
  4. A Program that Works Clumsily

    • Last year, I started using LLMs in programming in earnest while creating a RAG system based on Singaporean labor law. There was one thing I realized while using LLMs for the RAG system itself and for the development of that system.
    • You can create a clumsily working program very quickly. This automatic translator was made to 'work' with just two or three prompts.
    • In the initial proof-of-concept process, we can create a 'working' program with just a few ideas and prompts. But this level is not yet ready for the production stage. We need to go through more polishing to create a product that can be commercialized.
    • To raise the quality of that initial work, again together with the LLM, we need to understand the code the LLM wrote to 'some extent'.
    • That understanding is what lets us turn the initial work into a more refined product.

Now I'm ready to watch the Deep Dive into LLMs like ChatGPT video as I originally intended. I'm not sure how much of it I'll understand, but I'm hoping it will be more than I would have understood with the original English alone.
