Core Subtitle Merging

This section documents the public API for performing subtitle merges programmatically.

Subtitle Data Structures

These classes represent structured subtitle information used throughout the system.

class duosubs.SubtitleData(subs: list[SubtitleField] = <factory>, styles: SSAFile = <factory>, tokens: list[str] = <factory>, styles_tokens: list[str] = <factory>)

Container for subtitle fields, style information, tokenized sentences, and style tokens.

subs

List of subtitle field objects.

Type:: list[SubtitleField]

styles

SSAFile object containing style information.

Type:: pysubs2.SSAFile

tokens

List of tokenized sentences from the subtitles.

Type:: list[str]

styles_tokens

List of style tokens corresponding to each sentence.

Type:: list[str]

class duosubs.SubtitleField(start: int = -1, end: int = -1, primary_token_spans: tuple[int, int] = (-1, -1), secondary_token_spans: tuple[int, int] = (-1, -1), primary_text: str = '', secondary_text: str = '', score: float = -1.0, primary_style: str = 'Default', secondary_style: str = 'Default')

Represents a single subtitle event with timing, text, style, and alignment score.

start

Start time of the primary subtitle in milliseconds. Defaults to -1.

Type:: int

end

End time of the primary subtitle in milliseconds. Defaults to -1.

Type:: int

primary_token_spans

Tuple indicating the start and end token indices for the tokenized primary subtitle. Defaults to (-1, -1).

Type:: tuple[int, int]

secondary_token_spans

Tuple indicating the start and end token indices for the tokenized secondary subtitle. Defaults to (-1, -1).

Type:: tuple[int, int]

primary_text

Text content of the primary subtitle. Defaults to an empty string.

Type:: str

secondary_text

Text content of the secondary subtitle. Defaults to an empty string.

Type:: str

score

Alignment or similarity score between primary and secondary subtitles. Defaults to -1.0.

Type:: float

primary_style

Style name for the primary subtitle. Defaults to “Default”.

Type:: str

secondary_style

Style name for the secondary subtitle. Defaults to “Default”.

Type:: str
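
As a rough illustration of how these fields might be consumed, the sketch below formats the millisecond timings and picks out weakly aligned events. It relies only on the attribute names documented above. The helper names (format_ms, low_score_events) and the 0.5 threshold are hypothetical, and what counts as a "low" score depends on the model, so treat the cutoff as an assumption.

```python
from types import SimpleNamespace


def format_ms(ms: int) -> str:
    """Render a millisecond offset as HH:MM:SS,mmm (SubRip-style)."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def low_score_events(subs, threshold: float = 0.5):
    """Return events whose score is set (not the -1.0 default) and below threshold.

    `subs` is any iterable of objects exposing the documented SubtitleField
    attributes; the threshold is an assumed, illustrative value.
    """
    return [s for s in subs if s.score != -1.0 and s.score < threshold]


# Stand-in objects so the sketch runs without duosubs installed;
# real code would pass a list[SubtitleField] instead.
events = [
    SimpleNamespace(start=1_500, end=4_250, primary_text="Hello",
                    secondary_text="Bonjour", score=0.91),
    SimpleNamespace(start=5_000, end=7_000, primary_text="Bye",
                    secondary_text="", score=0.12),
]
for ev in low_score_events(events):
    print(f"{format_ms(ev.start)} --> {format_ms(ev.end)}  {ev.primary_text!r}")
```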

Subtitle Merging Pipeline

High-level functions and configuration classes for merging subtitles, including the main entry point and step-by-step processing stages.

Pipeline Configuration Options

class duosubs.MergeArgs(primary: Path | str = '', secondary: Path | str = '', model: str = 'LaBSE', device: DeviceType = DeviceType.AUTO, batch_size: int = 32, model_precision: ModelPrecision = ModelPrecision.FLOAT32, merging_mode: MergingMode = MergingMode.SYNCED, retain_newline: bool = False, secondary_above: bool = False, omit: List[OmitFile] = <factory>, format_all: SubtitleFormat | None = None, format_combined: SubtitleFormat | None = None, format_primary: SubtitleFormat | None = None, format_secondary: SubtitleFormat | None = None, output_name: str | None = None, output_dir: Path | None = None, ignore_non_overlap_filter: bool | None = None)

Container for all arguments used in the subtitle merging workflow.

primary

Path to the primary language subtitle file. Defaults to an empty string.

Type:: Path | str

secondary

Path to the secondary language subtitle file. Defaults to an empty string.

Type:: Path | str

model

Name of the SentenceTransformer model to use. Defaults to “LaBSE”.

Type:: str

device

Device type for model inference (one of ‘cpu’, ‘cuda’, ‘mps’, ‘auto’). Defaults to DeviceType.AUTO.

Type:: DeviceType

batch_size

Batch size for model inference. Defaults to 32.

Type:: int

model_precision

Precision mode for model inference. Defaults to ModelPrecision.FLOAT32.

Type:: ModelPrecision

merging_mode

Mode for merging subtitles.

Options include

MergingMode.SYNCED (all timestamps synced and from the same cut)
MergingMode.MIXED (some timestamps synced and from the same cut)
MergingMode.CUTS (different cuts with primary being extended version)

Defaults to MergingMode.SYNCED.

Note that the MIXED and CUTS modes work best when neither subtitle file contains added scene annotations.

Type:: MergingMode

retain_newline

Whether to retain ‘\N’ line breaks in output. Defaults to False.

Type:: bool

secondary_above

Whether to show the secondary subtitle above the primary. Defaults to False.

Type:: bool

omit

List of file types to omit from output. Defaults to [OmitFile.EDIT].

Type:: List[OmitFile]

format_all

File format for all subtitle outputs. Defaults to None.

Type:: Optional[SubtitleFormat]

format_combined

File format for combined subtitle output. Defaults to None.

Type:: Optional[SubtitleFormat]

format_primary

File format for primary subtitle output. Defaults to None.

Type:: Optional[SubtitleFormat]

format_secondary

File format for secondary subtitle output. Defaults to None.

Type:: Optional[SubtitleFormat]

output_name

Base name for output files (without extension). Defaults to None.

Type:: Optional[str]

output_dir

Output directory for generated files. Defaults to None.

Type:: Optional[Path]

ignore_non_overlap_filter

Deprecated since version 1.1.0.

Whether to ignore non-overlapping filter when merging subtitles.

If True, equivalent to MergingMode.MIXED. If False, equivalent to MergingMode.SYNCED.

This option will be removed in v2.0.0. Use merging_mode instead.

Type:: bool | None
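
The description of ignore_non_overlap_filter above implies a simple mapping from the deprecated flag to merging_mode. A minimal sketch of that mapping follows. The function name is hypothetical, and it returns the member name as a string so that it runs without duosubs installed.

```python
from typing import Optional


def merging_mode_name(ignore_non_overlap_filter: Optional[bool],
                      current: str = "SYNCED") -> str:
    """Translate the deprecated flag into a MergingMode member name.

    True -> "MIXED", False -> "SYNCED", None -> keep `current` unchanged.
    """
    if ignore_non_overlap_filter is None:
        return current
    return "MIXED" if ignore_non_overlap_filter else "SYNCED"


print(merging_mode_name(True))
print(merging_mode_name(None, current="CUTS"))
```

With the package available, MergingMode[merging_mode_name(flag)] would yield the enum member through standard Enum name lookup.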

Complete Merge Pipeline

duosubs.run_merge_pipeline(args: MergeArgs, logger: Callable[[str], None] | None = None) → None

Run the full subtitle merging pipeline: load, merge, and save subtitles.

Parameters:

args (MergeArgs) – Arguments for subtitle merging workflow, requires all args attributes.
logger (Callable[[str], None], optional) – Optional logger for progress messages.

Raises:

LoadSubsError – If there is an error loading subtitles or if they are empty.
LoadModelError – If there is an error loading the sentence transformer model.
MergeSubsError – If there is an error merging subtitles.
SaveSubsError – If there is an error saving the merged subtitles.
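
A call to the full pipeline might look like the sketch below, which builds a MergeArgs from two file paths and catches the four documented exceptions. The helper names (default_output_name, merge_files) and the "_merged" suffix are hypothetical. duosubs is imported inside the function so the naming helper can be used without the package installed.

```python
from pathlib import Path


def default_output_name(primary: Path) -> str:
    """Hypothetical naming helper: '<primary stem>_merged'."""
    return f"{primary.stem}_merged"


def merge_files(primary: str, secondary: str, output_dir: str) -> bool:
    """Run the full pipeline; return True on success, False on a duosubs error."""
    # Imported lazily so the helper above works without duosubs installed.
    from duosubs import (
        LoadModelError, LoadSubsError, MergeArgs, MergeSubsError,
        MergingMode, SaveSubsError, run_merge_pipeline,
    )

    primary_path = Path(primary)
    args = MergeArgs(
        primary=primary_path,
        secondary=Path(secondary),
        merging_mode=MergingMode.SYNCED,
        output_name=default_output_name(primary_path),
        output_dir=Path(output_dir),
    )
    try:
        # print accepts a single string, so it can serve as the logger.
        run_merge_pipeline(args, logger=print)
    except (LoadSubsError, LoadModelError, MergeSubsError, SaveSubsError) as exc:
        print(f"{type(exc).__name__}: {exc}")
        return False
    return True
```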

Step-by-Step Pipeline Functions

duosubs.load_subtitles(args: MergeArgs, stage_logger: Callable[[], None] | None = None) → tuple[SubtitleData, SubtitleData]

Load primary and secondary subtitles, styles, and tokens.

Parameters:

args (MergeArgs) – Arguments for subtitle merging workflow, requires only args.primary, args.secondary.
stage_logger (Callable[[], None], optional) – Optional logger for this stage.

Returns:

Primary subtitle data: subtitles, styles, tokens, and styles for each token.
Secondary subtitle data: subtitles, styles, tokens, and styles for each token.

Return type:

tuple[SubtitleData, SubtitleData]

Raises:

LoadSubsError – If there is an error loading subtitles or if they are empty.

duosubs.load_sentence_transformer_model(args: MergeArgs, stage_logger: Callable[[str, str], None] | None = None) → SentenceTransformer

Load a SentenceTransformer model on the specified device.

Parameters:

args (MergeArgs) – Arguments for subtitle merging workflow, requires only args.device, args.model, args.model_precision.
stage_logger (Callable[[str, str], None], optional) – Optional logger for this stage.

Returns:

Loaded sentence transformer model.

Return type:

SentenceTransformer

Raises:

LoadModelError – If there is an error loading the sentence transformer model.

duosubs.merge_subtitles(args: MergeArgs, model: SentenceTransformer, primary_subs_data: SubtitleData, secondary_subs_data: SubtitleData, stop_bit: list[bool], stage_logger: Callable[[], None] | None = None, progress_callback: Callable[[int], None] | None = None) → list[SubtitleField]

Merge primary and secondary subtitles using the provided model and configuration.

Parameters:

args (MergeArgs) – Arguments for subtitle merging workflow, requires only args.merging_mode, args.batch_size.
model (SentenceTransformer) – Loaded sentence transformer model.
primary_subs_data (SubtitleData) – Primary subtitles data, containing subtitles, styles, tokens, and styles for each token.
secondary_subs_data (SubtitleData) – Secondary subtitles data, containing subtitles, styles, tokens, and styles for each token.
stop_bit (list[bool]) – Flag to stop the merging process early.
stage_logger (Callable[[], None], optional) – Optional logger for this stage.
progress_callback (Callable[[int], None], optional) – Optional callback for progress updates.

Returns:

List of merged subtitle fields.

Return type:

list[SubtitleField]

Raises:

MergeSubsError – If there is an error merging subtitles.

duosubs.save_subtitles_in_zip(args: MergeArgs, subs: list[SubtitleField], primary_styles: SSAFile, secondary_styles: SSAFile, stage_logger: Callable[[str], None] | None = None) → None

Save merged subtitles with their styles in a zip archive.

Parameters:

args (MergeArgs) – Arguments for subtitle merging workflow, requires only args.format_all, args.format_combined, args.format_primary, args.format_secondary, args.omit, args.primary, args.output_name, args.output_dir, args.secondary_above, args.retain_newline
subs (list[SubtitleField]) – List of merged subtitle fields.
primary_styles (pysubs2.SSAFile) – Primary subtitle styles.
secondary_styles (pysubs2.SSAFile) – Secondary subtitle styles.
stage_logger (Callable[[str], None], optional) – Optional logger for this stage.

Raises:

SaveSubsError – If there is an error saving the merged subtitles.
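
The four stage functions above can be chained by hand, for example to collect progress values or to hold on to a stop flag. The sketch below uses only the documented signatures. Two points are assumptions: that stop_bit is a one-element list whose first item is set to True to request an early stop, and that progress_callback receives increasing integers. The helper names are hypothetical.

```python
from typing import Callable, List


def make_progress_recorder(sink: List[int]) -> Callable[[int], None]:
    """Build a progress_callback that stores each reported value in `sink`."""
    def record(value: int) -> None:
        sink.append(value)
    return record


def merge_step_by_step(primary: str, secondary: str) -> List[int]:
    """Load, merge, and save in separate stages; return recorded progress."""
    # Imported lazily so make_progress_recorder works without duosubs.
    from duosubs import (
        MergeArgs, load_sentence_transformer_model, load_subtitles,
        merge_subtitles, save_subtitles_in_zip,
    )

    args = MergeArgs(primary=primary, secondary=secondary)
    primary_data, secondary_data = load_subtitles(args)
    model = load_sentence_transformer_model(args)

    # Assumption: setting stop_bit[0] = True from another thread stops early.
    stop_bit = [False]
    progress: List[int] = []
    merged = merge_subtitles(
        args, model, primary_data, secondary_data, stop_bit,
        progress_callback=make_progress_recorder(progress),
    )
    save_subtitles_in_zip(args, merged, primary_data.styles, secondary_data.styles)
    return progress
```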

Advanced and Low-Level Utilities

Detailed classes and functions for advanced merging, direct file I/O, and in-memory operations.

Core Subtitle Merging Algorithm

class duosubs.Merger(primary_subs_data: SubtitleData, secondary_subs_data: SubtitleData)

A class for merging and aligning primary and secondary subtitle streams using sentence embeddings. Provides methods for handling non-overlapping subtitles, estimating alignment, and refining merges.

align_subs_using_neighbours(subs: list[SubtitleField], subtitle_window_size: int, model: SentenceTransformer, stage_number: int, stop_bit: list[bool], batch_size: int = 32, progress_callback: Callable[[int], None] | None = None) → tuple[list[SubtitleField], int]

Refines the alignment of merged subtitles based on sentence similarity using a sliding window of consecutive subtitle lines. Can be run in multiple stages for improved accuracy.

Parameters:

subs (list[SubtitleField]) – List of merged subtitles to refine.
subtitle_window_size (int) – Window size for refinement.
model (SentenceTransformer) – The sentence embedding model.
stage_number (int) – The current refinement stage number.
stop_bit (list[bool]) – A flag to allow early stopping.
batch_size (int, optional) – Batch size for model inference. Default is 32.
progress_callback (Callable[[int], None], optional) – Callback for progress updates.

Returns:

Refined sorted list of merged subtitles.
The updated refinement stage number, incremented by 1.

Return type:

tuple[list[SubtitleField], int]

align_subs_with_dtw(model: SentenceTransformer, stop_bit: list[bool], batch_size: int = 32, progress_callback: Callable[[int], None] | None = None) → list[SubtitleField]

Aligns subtitles using Dynamic Time Warping (DTW) based on sentence embeddings.

Parameters:

model (SentenceTransformer) – The sentence embedding model.
stop_bit (list[bool]) – A flag to allow early stopping.
batch_size (int, optional) – Batch size for model inference. Default is 32.
progress_callback (Callable[[int], None], optional) – Callback for progress updates.

Returns:

List of processed subtitle fields after DTW alignment.

Return type:

list[SubtitleField]

eliminate_unnecessary_newline(subs: list[SubtitleField], stop_bit: list[bool], progress_callback: Callable[[int], None] | None = None) → list[SubtitleField]

Cleans up unnecessary newlines in subtitle text fields.

Parameters:

subs (list[SubtitleField]) – List of merged subtitles to clean.
stop_bit (list[bool]) – A flag to allow early stopping.
progress_callback (Callable[[int], None], optional) – Callback for progress updates.

Returns:

Cleaned list of merged subtitles.

Return type:

list[SubtitleField]

extract_non_overlapping_subs(stop_bit: list[bool], progress_callback: Callable[[int], None] | None = None) → tuple[list[SubtitleField], list[SubtitleField]]

Extracts non-overlapping subtitles from the primary and secondary streams, and deletes them from the original streams before further processing.

Parameters:

stop_bit (list[bool]) – A flag to allow early stopping.
progress_callback (Callable[[int], None], optional) – Callback for progress updates.

Returns:

List of non-overlapping primary subtitles.
List of non-overlapping secondary subtitles.

Return type:

tuple[list[SubtitleField], list[SubtitleField]]

filter_and_extract_extended_version(subs: list[SubtitleField], model: SentenceTransformer, stop_bit: list[bool], batch_size: int = 32, progress_callback: Callable[[int], None] | None = None) → tuple[list[SubtitleField], list[SubtitleField]]

Filters and extracts extended segments from merged subtitles.

First, detects possible extended segments from a binary mask, in which 1 indicates that secondary text is present and 0 that it is absent. The mask is then denoised with an HMM.

Then, filters out the extended segments based on the best sentence similarity, by progressively deleting extended subtitle lines, starting from the middle of each extended segment.

Parameters:

subs (list[SubtitleField]) – List of merged subtitles to filter.
model (SentenceTransformer) – The sentence embedding model.
stop_bit (list[bool]) – A flag to allow early stopping.
batch_size (int, optional) – Batch size for model inference. Default is 32.
progress_callback (Callable[[int], None], optional) – Callback for progress updates.

Returns:

Filtered list of merged subtitles.
List of subtitles representing extended cut.

Return type:

tuple[list[SubtitleField], list[SubtitleField]]

merge_subtitle(model: SentenceTransformer, stop_bit: list[bool], ignore_non_overlap_filter: bool = False, batch_size: int = 32, progress_callback: Callable[[int], None] | None = None) → list[SubtitleField]

Main method to merge and align subtitles using a sentence embedding model. First extracts non-overlapping subtitles, then performs Dynamic Time Warping (DTW) alignment, and finally refines subtitle merges using a sliding window approach.

Parameters:

model (SentenceTransformer) – The sentence embedding model.
stop_bit (list[bool]) – A flag to allow early stopping.
ignore_non_overlap_filter (bool, optional) – Flag to ignore non-overlapping filter stage. Default is False.
batch_size (int, optional) – Batch size for model inference. Default is 32.
progress_callback (Callable[[int], None], optional) – Callback for progress updates.

Returns:

List of merged and sorted subtitle fields after complete alignment.

Return type:

list[SubtitleField]

merge_subtitle_extended_cut(model: SentenceTransformer, stop_bit: list[bool], batch_size: int = 32, progress_callback: Callable[[int], None] | None = None) → list[SubtitleField]

Merges subtitles when the primary subtitle is an extended cut, skipping non-overlapping extraction and applying additional extended-cut filtering and refinement steps.

Parameters:

model (SentenceTransformer) – The sentence embedding model.
stop_bit (list[bool]) – A flag to allow early stopping.
batch_size (int, optional) – Batch size for model inference. Default is 32.
progress_callback (Callable[[int], None], optional) – Callback for progress updates.

Returns:

List of merged and sorted subtitle fields after alignment.

Return type:

list[SubtitleField]

Subtitle File Loading Utilities

duosubs.load_file_edit(file_path: Path | str) → tuple[list[SubtitleField], SSAFile, SSAFile]

Loads a compressed subtitle edit file, including subtitle fields and style information.

Parameters:

file_path (Path | str) – Path to the compressed edit file (.json.gz).

Returns:

List of SubtitleField objects representing the subtitles.
SSAFile object containing primary styles.
SSAFile object containing secondary styles.

Return type:

tuple[list[SubtitleField], pysubs2.SSAFile, pysubs2.SSAFile]

duosubs.load_subs(subtitle_path: Path | str) → SubtitleData

Loads subtitles, tokenizes sentences, and extracts style and token information.

Parameters:

subtitle_path (Path | str) – Path to the subtitle file.

Returns:

A SubtitleData object containing:

List of SubtitleField objects for each subtitle event. (list[SubtitleField])
SSAFile object containing style information. (pysubs2.SSAFile)
List of tokenized sentences from the subtitles. (list[str])
List of style tokens corresponding to each sentence. (list[str])

Return type:

SubtitleData

Subtitle File Writing Utilities

duosubs.save_file_combined(sub_list: list[SubtitleField], primary_styles: SSAFile, secondary_styles: SSAFile, save_path: Path | str, secondary_above: bool = True, retain_newline: bool = True) → None

Save combined bilingual subtitles to a file, with both primary and secondary text in each event.

Parameters:

sub_list (list[SubtitleField]) – List of subtitle fields to save.
primary_styles (pysubs2.SSAFile) – Primary subtitle styles.
secondary_styles (pysubs2.SSAFile) – Secondary subtitle styles.
save_path (Path | str) – Path to save the subtitle file.
secondary_above (bool, optional) – If True, secondary text is above primary. Defaults to True.
retain_newline (bool, optional) – If True, retain line breaks. Defaults to True.

duosubs.save_file_edit(sub_list: list[SubtitleField], primary_styles: SSAFile, secondary_styles: SSAFile, save_path: Path | str) → None

Save subtitle data and styles to a compressed edit file (.json.gz).

Parameters:

sub_list (list[SubtitleField]) – List of subtitle fields to save.
primary_styles (pysubs2.SSAFile) – Primary subtitle styles.
secondary_styles (pysubs2.SSAFile) – Secondary subtitle styles.
save_path (Path | str) – Path to save the compressed file (without .json.gz extension).

duosubs.save_file_separate(sub_list: list[SubtitleField], primary_styles: SSAFile, secondary_styles: SSAFile, save_path_primary: Path | str, save_path_secondary: Path | str, retain_newline: bool = True) → None

Save primary and secondary subtitles as separate files.

Parameters:

sub_list (list[SubtitleField]) – List of subtitle fields to save.
primary_styles (pysubs2.SSAFile) – Primary subtitle styles.
secondary_styles (pysubs2.SSAFile) – Secondary subtitle styles.
save_path_primary (Path | str) – Path to save the primary subtitle file.
save_path_secondary (Path | str) – Path to save the secondary subtitle file.
retain_newline (bool, optional) – If True, retain line breaks. Defaults to True.

duosubs.save_memory_combined(sub_list: list[SubtitleField], primary_styles: SSAFile, secondary_styles: SSAFile, extension_fmt: str, secondary_above: bool = True, retain_newline: bool = True) → bytes

Save combined bilingual subtitles to memory as bytes.

Parameters:

sub_list (list[SubtitleField]) – List of subtitle fields to save.
primary_styles (pysubs2.SSAFile) – Primary subtitle styles.
secondary_styles (pysubs2.SSAFile) – Secondary subtitle styles.
extension_fmt (str) – Subtitle file format/extension.
secondary_above (bool, optional) – If True, secondary text is above primary. Defaults to True.
retain_newline (bool, optional) – If True, retain line breaks. Defaults to True.

Returns:

Subtitle file content as UTF-8 encoded bytes.

Return type:

bytes

duosubs.save_memory_edit(sub_list: list[SubtitleField], primary_styles: SSAFile, secondary_styles: SSAFile) → bytes

Save subtitle data and styles to a compressed edit file in memory.

Parameters:

sub_list (list[SubtitleField]) – List of subtitle fields to save.
primary_styles (pysubs2.SSAFile) – Primary subtitle styles.
secondary_styles (pysubs2.SSAFile) – Secondary subtitle styles.

Returns:

Compressed file content as bytes.

Return type:

bytes

duosubs.save_memory_separate(sub_list: list[SubtitleField], primary_styles: SSAFile, secondary_styles: SSAFile, extension_primary: str, extension_secondary: str, retain_newline: bool = True) → tuple[bytes, bytes]

Save primary and secondary subtitles as separate files in memory.

Parameters:

sub_list (list[SubtitleField]) – List of subtitle fields to save.
primary_styles (pysubs2.SSAFile) – Primary subtitle styles.
secondary_styles (pysubs2.SSAFile) – Secondary subtitle styles.
extension_primary (str) – File format/extension for primary subtitles.
extension_secondary (str) – File format/extension for secondary subtitles.
retain_newline (bool, optional) – If True, retain line breaks. Defaults to True.

Returns:

UTF-8 encoded bytes for primary subtitles.
UTF-8 encoded bytes for secondary subtitles.

Return type:

tuple[bytes, bytes]
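
Because the save_memory_* functions return bytes, their output can be packaged without touching the disk. The sketch below reloads an edit file and bundles a combined subtitle plus a fresh edit file into an in-memory zip using the standard library. bundle_in_zip, export_from_edit_file, and the archive entry names are hypothetical. The exact form expected for extension_fmt is not specified in this section, so it is passed through from the caller unchanged.

```python
import io
import zipfile
from typing import Dict


def bundle_in_zip(files: Dict[str, bytes]) -> bytes:
    """Pack {archive name: content} into an in-memory zip and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


def export_from_edit_file(edit_path: str, extension_fmt: str,
                          combined_name: str) -> bytes:
    """Reload an edit file and return a zip with combined and edit outputs."""
    # Imported lazily so bundle_in_zip works without duosubs installed.
    from duosubs import load_file_edit, save_memory_combined, save_memory_edit

    subs, primary_styles, secondary_styles = load_file_edit(edit_path)
    combined = save_memory_combined(
        subs, primary_styles, secondary_styles, extension_fmt
    )
    edit = save_memory_edit(subs, primary_styles, secondary_styles)
    return bundle_in_zip({combined_name: combined, "project.json.gz": edit})
```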

Exception Classes

Custom exceptions for error handling during subtitle loading, model inference, merging, and saving.

class duosubs.LoadSubsError(message: str, original_exception: Exception | None = None)

Bases: Exception

Exception raised when loading subtitle files fails.

Parameters:

message (str) – Description of the error.
original_exception (Exception, optional) – The original exception that caused the error.

class duosubs.LoadModelError(message: str, original_exception: Exception | None = None)

Bases: Exception

Exception raised when loading the sentence transformer model fails.

Parameters:

message (str) – Description of the error.
original_exception (Exception, optional) – The original exception that caused the error.

class duosubs.MergeSubsError(message: str, original_exception: Exception | None = None)

Bases: Exception

Exception raised when merging subtitles fails.

Parameters:

message (str) – Description of the error.
original_exception (Exception, optional) – The original exception that caused the error.

class duosubs.SaveSubsError(message: str, original_exception: Exception | None = None)

Bases: Exception

Exception raised when saving subtitle files fails.

Parameters:

message (str) – Description of the error.
original_exception (Exception, optional) – The original exception that caused the error.

Configuration Enums

Enumerations for device selection, model precision, file omission, and supported subtitle formats.

class duosubs.DeviceType(*values)

Enum for device types used to run the model.

CPU

Use CPU for computation.

Type:: str

CUDA

Use Nvidia or AMD GPU for computation.

Type:: str

MPS

Use Apple Metal Performance Shaders for computation.

Type:: str

AUTO

Automatically select the best available device.

Type:: str

class duosubs.ModelPrecision(*values)

Enum options for precision mode used in model inference.

FLOAT32

32-bit floating point (full precision)

Type:: str

FLOAT16

16-bit floating point (half precision)

Type:: str

BFLOAT16

Brain Floating Point 16-bit precision (optimized for newer hardware)

Type:: str

to_torch_dtype() → dtype

Converts the precision enum value to a corresponding PyTorch dtype.

Returns:: Corresponding PyTorch dtype for the precision.
Return type:: torch.dtype

class duosubs.MergingMode(*values)

Enum for subtitle merging modes.

SYNCED

All timestamps overlap and both subtitles are from the same cut.

Type:: str

MIXED

Some timestamps do not overlap and both subtitles are from the same cut.

Type:: str

CUTS

The subtitles come from different cuts, with the primary subtitles being the extended or longer version.

Type:: str

class duosubs.OmitFile(*values)

Enum for file types that can be omitted from output packaging.

NONE

No file is omitted.

Type:: str

COMBINED

Combined subtitle file.

Type:: str

PRIMARY

Primary subtitle file.

Type:: str

SECONDARY

Secondary subtitle file.

Type:: str

EDIT

Edit file (e.g., for project or intermediate data).

Type:: str

class duosubs.SubtitleFormat(*values)

Enum for supported subtitle file formats.

SRT

SubRip subtitle format (‘.srt’).

Type:: str

VTT

WebVTT subtitle format (‘.vtt’).

Type:: str

MPL2

MPL2 subtitle format (‘.mpl2’).

Type:: str

TTML

Timed Text Markup Language format (‘.ttml’).

Type:: str

ASS

Advanced SubStation Alpha format (‘.ass’).

Type:: str

SSA

SubStation Alpha format (‘.ssa’).

Type:: str
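
One way to choose a SubtitleFormat member from a file name is to match the extension against the member names listed above. The sketch below does this with plain strings so it runs without duosubs installed. format_name_for and FORMAT_NAMES are hypothetical names.

```python
from pathlib import Path

# Member names as listed for duosubs.SubtitleFormat above.
FORMAT_NAMES = {"SRT", "VTT", "MPL2", "TTML", "ASS", "SSA"}


def format_name_for(path: str) -> str:
    """Return the SubtitleFormat member name matching a file's extension."""
    name = Path(path).suffix.lstrip(".").upper()
    if name not in FORMAT_NAMES:
        raise ValueError(f"unsupported subtitle extension: {path!r}")
    return name


print(format_name_for("episode01.ass"))
```

With the package available, SubtitleFormat[format_name_for(path)] would return the enum member through standard Enum name lookup.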