songhyeonsu 3e5f823dd5 (1 month ago): Add Korean LP dictionary and PGNet config
- dict/kr_lp_dict.txt: 67 chars covering 4 plate types (10 digits + 40 usage hangul + 17 region hangul, dedup)
- configs/kr_lp_pgnet.yml: PGNet config tuned for Korean LP (pad_num=67, max_text_length=10, valid_set=partvgg, infer_visual_type=CN)
- setup_server.sh: symlink dict and config into PaddleOCR tree
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

kr_lp_pgnet

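A training project for end-to-end detection and OCR of Korean license plates (LP), based on PaddleOCR PGNet.

Target plates: passenger (white), commercial (yellow), electric vehicle (blue, 8 characters), and freight/special-purpose.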

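Directory structure

kr_lp_pgnet/
├── configs/        # PGNet training config (.yml)
├── dict/           # character dictionary (kr_lp_dict.txt)
├── data_gen/       # synthetic LP image generator
├── scripts/        # shell scripts for server setup and training runs
└── tools/          # helper scripts (label validation, visualization, etc.)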
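
The commit message describes dict/kr_lp_dict.txt as 67 deduplicated characters (10 digits, 40 usage hangul, 17 region hangul). The file itself is not shown here, so the following is only a sketch of how such a dictionary could be assembled. The character sets (`USAGE`, `REGIONS`) and the function names are assumptions based on characters commonly seen on Korean plates, not the repo's actual contents. The output uses one character per line, which is the format PaddleOCR dictionaries use.

```python
from pathlib import Path

DIGITS = "0123456789"

# Assumption: 40 usage characters (private, commercial, rental, delivery).
USAGE = (
    "가나다라마"
    "거너더러머버서어저"
    "고노도로모보소오조"
    "구누두루무부수우주"
    "바사아자"
    "하허호"
    "배"
)

# Assumption: region names printed on older regional plates.
REGIONS = [
    "서울", "부산", "대구", "인천", "광주", "대전", "울산", "세종", "경기",
    "강원", "충북", "충남", "전북", "전남", "경북", "경남", "제주",
]


def build_charset():
    """Return digits, usage and region characters, deduplicated in first-seen order."""
    seen = {}
    for ch in DIGITS + USAGE + "".join(REGIONS):
        seen.setdefault(ch, None)  # dict keys keep insertion order
    return list(seen)


def write_dict(path):
    """Write one character per line and return the number of entries."""
    chars = build_charset()
    Path(path).write_text("\n".join(chars) + "\n", encoding="utf-8")
    return len(chars)
```

Under these assumed sets, region characters that already appear among the usage characters (for example 서 and 부) are written only once, which is what the "dedup" note in the commit message suggests.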

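Division of work

Local (Mac): write and debug the config, dict, generator, and run scripts
Remote GPU server (NVIDIA + CUDA): generate synthetic data and run training
Sync: this repo is synced via git push/pull; data and checkpoints are not committed to git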

Execution order on the server

# 1. One time only: environment setup (install Paddle, clone PaddleOCR, download pretrained weights)
bash scripts/setup_server.sh

# 2. Generate synthetic data (hundreds of thousands of images)
python data_gen/generate_synthetic.py --out_dir ../train_data/kr_lp_synth --num 200000

# 3. Step1: pretrain on synthetic data
bash scripts/run_step1.sh

# 4. Step2: fine-tune on real LP data
bash scripts/run_step2.sh

# 5. Export to an inference model
bash scripts/export_inference.sh
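
The five steps above must run in order, and a failed step should stop the rest. A minimal sketch of a driver is shown below. The commands are copied from the list above; `run_pipeline`, its `start` and `dry_run` parameters, and the assumption that it is launched from the kr_lp_pgnet/ root are hypothetical and not part of the repo.

```python
import subprocess

# Commands copied from the step list above, in execution order.
STEPS = [
    ["bash", "scripts/setup_server.sh"],
    ["python", "data_gen/generate_synthetic.py",
     "--out_dir", "../train_data/kr_lp_synth", "--num", "200000"],
    ["bash", "scripts/run_step1.sh"],
    ["bash", "scripts/run_step2.sh"],
    ["bash", "scripts/export_inference.sh"],
]


def run_pipeline(start=1, dry_run=True, runner=subprocess.run):
    """Run steps start..5 in order; return the step numbers that were processed.

    With dry_run=True the commands are only printed. Otherwise each command
    is executed with check=True, so the first failure raises
    CalledProcessError and the remaining steps are skipped.
    """
    if not 1 <= start <= len(STEPS):
        raise ValueError(f"start must be between 1 and {len(STEPS)}")
    done = []
    for number, cmd in enumerate(STEPS[start - 1:], start=start):
        print(f"[step {number}] {' '.join(cmd)}")
        if not dry_run:
            runner(cmd, check=True)
        done.append(number)
    return done
```

For example, `run_pipeline(start=3, dry_run=False)` would skip setup and data generation and resume from the Step1 pretraining script.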

Assumed directory layout

The server is assumed to have the following layout:

~/workspace/
├── PaddleOCR/              # git clone PaddlePaddle/PaddleOCR
├── kr_lp_pgnet/            # this repo
└── train_data/
    ├── kr_lp_synth/        # synthetic data (generator output)
    └── kr_lp_real/         # real photographed LP data
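
Because the scripts rely on this layout, it can help to verify it before starting a long run. The following is a minimal sketch of such a check. `EXPECTED_DIRS` mirrors the tree above, while the `missing_dirs` helper is hypothetical and not part of the repo.

```python
from pathlib import Path

# Directories from the layout above, relative to the workspace root.
EXPECTED_DIRS = [
    "PaddleOCR",
    "kr_lp_pgnet",
    "train_data/kr_lp_synth",
    "train_data/kr_lp_real",
]


def missing_dirs(workspace="~/workspace"):
    """Return the expected directories that do not exist under workspace."""
    root = Path(workspace).expanduser()
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Workspace layout looks complete.")
```

Before synthetic data has been generated, `train_data/kr_lp_synth` would be expected to show up in the result; whether the generator creates that directory itself is not stated here.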