Scaffolding methods

To use any of the components described here, import it from khloraascaf.

Functions

From input files

scaffolding(contig_attrs, contig_links, contig_starter, multiplicity_upperbound=MULT_UPB_DEF, presence_score_upperbound=PRESSCORE_UPB_DEF, solver=SOLVER_CBC, outdir=OUTDIR_DEF, instance_name=INSTANCE_NAME_DEF, debug=OUTDEBUG_DEF)

Computes the scaffolding.

Parameters:
  • contig_attrs (Path) – Contigs’ attributes file path

  • contig_links (Path) – Contigs’ links file path

  • contig_starter (IdCT) – Starter contig’s identifier

  • multiplicity_upperbound (MultT, optional) – Multiplicities upper bound, by default MULT_UPB_DEF

  • presence_score_upperbound (PresScoreT, optional) – Presence score upper bound, by default PRESSCORE_UPB_DEF

  • solver (str, optional) – MILP solver to use (‘cbc’ or ‘gurobi’), by default SOLVER_CBC

  • outdir (Path, optional) – Output directory path, by default OUTDIR_DEF

  • instance_name (str, optional) – Instance name, by default INSTANCE_NAME_DEF

  • debug (bool, optional) – Output debug or not, by default False

Returns:

Result output directory path

Return type:

Path

Raises:

ScaffoldingError – If the scaffolding fails

Notes

A uniquely named directory will be created to store the result files.
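
A call to scaffolding can be sketched as follows. This is a minimal sketch, not the library's own code: the wrapper name run_scaffolding is hypothetical, the function is passed in as an argument so that the snippet runs without khloraascaf installed, and creating the output directory beforehand is a precaution assumed here, not a documented requirement.

```python
from pathlib import Path
from typing import Any, Callable

# Solver names listed in the parameter description above.
KNOWN_SOLVERS = ('cbc', 'gurobi')


def run_scaffolding(
    scaffolding: Callable[..., Path],
    contig_attrs: Path,
    contig_links: Path,
    contig_starter: Any,
    solver: str = 'cbc',
    outdir: Path = Path('.'),
) -> Path:
    """Hypothetical wrapper: check the inputs, then call ``scaffolding``."""
    if solver not in KNOWN_SOLVERS:
        raise ValueError(f'unknown solver: {solver!r}')
    for path in (contig_attrs, contig_links):
        if not path.is_file():
            raise FileNotFoundError(path)
    # Assumption: creating the output directory up front is harmless.
    outdir.mkdir(parents=True, exist_ok=True)
    # Argument names follow the signature documented above; the other
    # optional parameters keep their defaults.
    return scaffolding(
        contig_attrs, contig_links, contig_starter,
        solver=solver, outdir=outdir,
    )
```

With the library installed, the first argument would be the function obtained with `from khloraascaf import scaffolding`. The returned path is the result directory described in the note above.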

From mathematical data

combine_scaffolding_problems(mdcg, starter_vertex, solver, outdir, instance_name, debug=OUTDEBUG_DEF)

Find a priority between the optimisation problems.

Parameters:
  • mdcg (MDCGraph) – Multiplied doubled contig graph

  • starter_vertex (OccOrCT) – Starter vertex

  • solver (str) – MILP solver to use (‘cbc’ or ‘gurobi’)

  • outdir (Path) – Output directory path

  • instance_name (str) – Instance’s name

  • debug (bool, optional) – Output debug or not, by default False

Returns:

Best ILP results

Return type:

tuple of ScaffoldingResult

Raises:

CombineScaffoldingError – The scaffolding combination has failed

Warning

Files in the output directory can be erased.
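
Because files already present in the output directory can be erased, one precaution is to give each call a fresh, empty directory. The sketch below uses only the Python standard library; the helper name fresh_outdir is hypothetical and is not part of khloraascaf.

```python
import tempfile
from pathlib import Path


def fresh_outdir(parent: Path, instance_name: str) -> Path:
    """Create and return a new empty directory under ``parent``."""
    parent.mkdir(parents=True, exist_ok=True)
    # mkdtemp creates a directory that did not exist before the call,
    # so it cannot contain files from an earlier run.
    return Path(tempfile.mkdtemp(prefix=f'{instance_name}_', dir=parent))
```

The returned path can then be passed as the outdir argument.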

combine_repeat_scaffolding(opti_results, mdcg, starter_vertex, solver, outdir, instance_name, debug)

Combine repeat scaffolding.

Parameters:
  • opti_results (tuple of ScaffoldingResult) – Best scaffolding results

  • mdcg (MDCGraph) – Multiplied doubled contig graph

  • starter_vertex (OccOrCT) – Starter vertex

  • solver (str) – Solver to use

  • outdir (Path) – Output directory path

  • instance_name (str) – Instance name

  • debug (bool) – To output debug files or not

Returns:

Best scaffolding results

Return type:

tuple of ScaffoldingResult

Raises:

RepeatScaffoldingError – If one of the repeat solves fails during the combination

combine_singlecopy_scaffolding(opti_results, mdcg, starter_vertex, solver, outdir, instance_name, debug)

Combine best previous results with single copy scaffolding.

Parameters:
  • opti_results (tuple of ScaffoldingResult) – Best scaffolding results

  • mdcg (MDCGraph) – Multiplied doubled contig graph

  • starter_vertex (OccOrCT) – Starter vertex

  • solver (str) – Solver to use

  • outdir (Path) – Output directory path

  • instance_name (str) – Instance name

  • debug (bool) – To output debug files or not

Returns:

Best scaffolding results

Return type:

tuple of ScaffoldingResult

Raises:

SingleCopyScaffoldingError – If the scaffolding of one of the single-copy regions fails

scaffolding_region(region_id, mdcg, starter_vertex, solver, outdir, instance_name, fix_result=None, debug=OUTDEBUG_DEF)

Scaffolding of a specific region.

Parameters:
  • region_id (RegionIDT) – Code of the region to scaffold

  • mdcg (MDCGraph) – Multiplied doubled contig graph

  • starter_vertex (OccOrCT) – Starter vertex

  • solver (str) – MILP solver to use (‘cbc’ or ‘gurobi’)

  • outdir (Path) – Output directory path

  • instance_name (str) – Instance’s name

  • fix_result (ScaffoldingResult, optional) – Previous scaffolding result, by default None

  • debug (bool, optional) – Output debug or not, by default False

Returns:

Scaffolding result

Return type:

ScaffoldingResult

Raises:
  • WrongRegionID – The given region code is invalid

  • UnfeasibleIR – The combinatorial problem is unfeasible

  • UnfeasibleDR – The combinatorial problem is unfeasible

  • UnfeasibleSC – The combinatorial problem is unfeasible

Warning

Files in the output directory can be erased.
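
The exceptions listed for scaffolding_region can be handled as in the following sketch. It rests on several assumptions: the helper name try_region is hypothetical, the function and the exception classes are passed in as arguments because this page does not state which module exports the exceptions, and returning None for an unfeasible problem is only one possible policy.

```python
from typing import Any, Callable, Optional, Tuple, Type


def try_region(
    scaffolding_region: Callable[..., Any],
    unfeasible_errors: Tuple[Type[BaseException], ...],
    region_id: Any,
    *args: Any,
    **kwargs: Any,
) -> Optional[Any]:
    """Return the region's result, or None if its problem is unfeasible."""
    try:
        return scaffolding_region(region_id, *args, **kwargs)
    except unfeasible_errors:
        # Only the exception classes supplied by the caller are caught;
        # anything else (for example WrongRegionID) still propagates.
        return None
```

A caller would typically pass the three Unfeasible classes as unfeasible_errors and let WrongRegionID propagate, since an invalid region code points to a programming error and not to an unsolvable instance.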