[Python] Creating MP3 files from the contents of a JSON file

Posted by Albert [2025-11-10]

JSON contents

[
{
"no": 1,
"eng": "Which of the following descriptions of the image is incorrect?\n- (1) Many people are working on computers in an event hall.\n- (2) A banner reads 'DIVE 2024 IN BUSAN'.\n- (3) People are all lined up waiting for food.\n- (4) There are many lights installed on the ceiling.",
"img": "busan_dive.jpg"
},
{
"no": 2,
"eng": "Which of the following descriptions of the image is incorrect?\n- (1) There is a yellow sculpture on the left side of the image.\n- (2) The building has the words \"Local Stitch\" on it.\n- (3) The sculpture is red.\n- (4) The building has multiple floors.",
"img": "local_stitch.jpg"
},
{
"no": 3,
"eng": "Which of the following descriptions of the image is incorrect?\n- (1) There are people sitting inside the café.\n- (2) The wall behind the counter is yellow.\n- (3) A metal counter is visible.\n- (4) The people working behind the counter are wearing black clothes.",
"img": "local_stitch_terrarosa.jpg"
},
{
"no": 4,
"eng": "Which of the following descriptions of the image is incorrect?\n- (1) The building is made of brick.\n- (2) There is a coffee shop on the first floor of the building.\n- (3) The rooftop of the building has a green roof.\n- (4) There are several cars parked on the road in front of the building.",
"img": "mangwon.jpg"
},
{
"no": 5,
"eng": "Which of the following descriptions of the image is incorrect?\n- (1) The shop displays various types of bread.\n- (2) Price tags are visible on the top shelf bread.\n- (3) The clerk is standing facing forward.\n- (4) There are various types of baguettes inside the glass display.",
"img": "mangwon_bakery.jpg"
},
{
"no": 6,
"eng": "Which of the following descriptions of the image is incorrect?\n- (1) People are doing construction work in front of the store.\n- (2) The building on the right has the name Nutricle.\n- (3) The bakery cafe is named 'pomne verte'.\n- (4) A tree is placed in front of the store.",
"img": "sangam_interior.jpg"
},
{
"no": 7,
"eng": "Which of the following descriptions of the image is incorrect?\n- (1) You can see the outside scenery through large windows.\n- (2) People are gathered around a meeting table.\n- (3) The floor has a wood pattern.\n- (4) There are multiple fans installed on the ceiling.",
"img": "seolleung_terrarosa.jpg"
},
{
"no": 8,
"eng": "Which of the following descriptions of the image is incorrect?\n- (1) People are lining up to get on a bus.\n- (2) A person on the right is talking on the phone.\n- (3) There are trees lined up in front of the building.\n- (4) The tour bus is painted red.",
"img": "stanford_coffee.jpg"
}
]
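
Each entry in the JSON above carries three keys: "no" (question number), "eng" (question text with four options), and "img" (image file name). Before spending API calls on text-to-speech, it may help to check that every entry has the expected shape. The following is only a minimal sketch; the helper name check_quiz_entries and the small sample data are made up for illustration and are not part of the original post.

```python
import json

# Keys that every quiz entry in the JSON file is expected to have.
REQUIRED_KEYS = {"no", "eng", "img"}


def check_quiz_entries(entries):
    """Return a list of problems found; an empty list means the data looks usable."""
    problems = []
    seen_numbers = set()
    for i, entry in enumerate(entries):
        missing = REQUIRED_KEYS - set(entry)
        if missing:
            problems.append(f"entry {i}: missing keys {sorted(missing)}")
            continue
        if entry["no"] in seen_numbers:
            # Duplicate numbers would overwrite each other's {no}.mp3 file.
            problems.append(f"entry {i}: duplicate no {entry['no']}")
        seen_numbers.add(entry["no"])
    return problems


# Hypothetical sample data: the second entry lacks the "img" key.
sample = json.loads(
    '[{"no": 1, "eng": "Question?\\n- (1) A.", "img": "a.jpg"},'
    ' {"no": 1, "eng": "Q"}]'
)
print(check_quiz_entries(sample))
```

The duplicate check matters because the script names each output file after "no", so two entries with the same number would write to the same MP3 file.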


Creating the MP3 files

from openai import OpenAI
from dotenv import load_dotenv
import os

load_dotenv()
api_key = os.getenv("OPENAI_API_KEY")  # Get the API key from the environment variable.
client = OpenAI(api_key=api_key)  # Create an OpenAI client instance.

import json

# Open the JSON file
with open('../data/images/image_quiz_eng.json', 'r', encoding='utf-8') as f:
    eng_dict = json.load(f)

eng_dict  # In a notebook, this displays the loaded list of questions

voices = ['alloy', 'ash', 'coral', 'echo', 'fable', 'onyx', 'nova', 'sage', 'shimmer']

for q in eng_dict:
    no = q['no']
    quiz = q['eng']
    # Replace the option markers with spoken words so they are read aloud naturally
    quiz = quiz.replace("- (1)", "- One.\t")
    quiz = quiz.replace("- (2)", "- Two.\t")
    quiz = quiz.replace("- (3)", "- Three.\t")
    quiz = quiz.replace("- (4)", "- Four.\t")

    print(no, quiz)

    # Choose the voice by the remainder of the question number divided by the number of voices
    voice = voices[no % len(voices)]

    response = client.audio.speech.create(
        model="tts-1-hd",
        voice=voice,
        input=f'{no}. {quiz}',
    )

    # Save the generated speech as <question number>.mp3
    response.write_to_file(f"../data/audio/{no}.mp3")


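When run, the script goes through the questions in the JSON file in order and creates one MP3 file per question in the data/audio/ folder, named after the question number (1.mp3, 2.mp3, and so on).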
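
The text preparation and voice rotation can be tried out without calling the API. The sketch below pulls those two steps into small functions so the result can be inspected first; the function names prepare_tts_text and pick_voice are hypothetical helpers introduced here, not something from the script or the OpenAI library.

```python
# Voices listed in the script, in the same order.
VOICES = ['alloy', 'ash', 'coral', 'echo', 'fable', 'onyx', 'nova', 'sage', 'shimmer']

# Spoken replacements for the numbered option markers in the quiz text.
OPTION_WORDS = {
    "- (1)": "- One.\t",
    "- (2)": "- Two.\t",
    "- (3)": "- Three.\t",
    "- (4)": "- Four.\t",
}


def prepare_tts_text(no, eng):
    """Return the text that would be sent to the TTS model for one question."""
    for marker, spoken in OPTION_WORDS.items():
        eng = eng.replace(marker, spoken)
    return f"{no}. {eng}"


def pick_voice(no, voices=VOICES):
    """Rotate through the voices using the question number modulo the voice count."""
    return voices[no % len(voices)]


# Show which voice each of the eight questions would get.
for no in range(1, 9):
    print(no, pick_voice(no))
```

Because the index is no % len(voices) and the question numbers start at 1, questions 1 through 8 map to 'ash' through 'shimmer'; 'alloy' would only be picked for a question number that is a multiple of 9. Using (no - 1) % len(voices) instead would start the rotation at 'alloy', if that is preferred.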




