Fix and perspectives

Fixes

  • #FIXME detect multigraphs in the input data

    • #TODO print an error and exit the program if the input is a multigraph
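
A minimal sketch of the multigraph check described above. It assumes the contig links can be read as ordered (source, target) pairs; the real link format may carry orientations or other fields, and the names find_parallel_links and exit_if_multigraph are hypothetical, not part of the package.

```python
import sys
from collections import Counter
from typing import Hashable, Iterable, List, Tuple

# Assumption: a link is an ordered pair of contig identifiers
Link = Tuple[Hashable, Hashable]


def find_parallel_links(links: Iterable[Link]) -> List[Link]:
    """Return each link that occurs more than once (hypothetical helper)."""
    counts = Counter(links)
    return [link for link, count in counts.items() if count > 1]


def exit_if_multigraph(links: Iterable[Link]) -> None:
    """Print an error and exit with status 1 if a link is repeated."""
    parallel = find_parallel_links(links)
    if parallel:
        print(
            f"ERROR: the input is a multigraph, repeated links: {parallel}",
            file=sys.stderr,
        )
        sys.exit(1)
```

Whether (C0, C1) and (C1, C0) should count as the same link depends on how orientations are encoded in the input, so this sketch treats them as distinct.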

Output files

Documentation

  • #DOCU Map of regions

    • Filename format, cf. the fmt_map_of_regions_filename function

    • File format, cf. the write_map_of_regions function (read & write)

  • #DOCU Contigs of regions

    • Filename format, cf. the fmt_contigs_of_regions_filename function

    • File format, cf. the write_contigs_of_regions function (read & write)

  • Debug files

    • #DOCU Vertices of regions

      • Filename format, cf. the fmt_vertices_of_regions_filename function

      • File format, cf. the write_vertices_of_regions function (read & write)

    • #DOCU Repeated fragments canonicals

      • Filename format, cf. the fmt_repfrag_canonicals_filename function

      • File format, cf. the write_repfrag_canonicals function (read & write)

  • #DOCU the default output directory name is generated as a UID-like string

    • yyyy-mm-dd_HH-MM-SS_<instance_name>

  • #DOCU Output a YAML with run information

    • <generated_output_directory>/io_config.yaml

      contig_attributes: <path>
      contig_links: <path>
      starter_id: C0
      multiplicity_upperbound: 4
      presence_score_upperbound: 1.0
      solver: cbc
      output_directory: <path>
      instance_name: khloraascaf
      debug: true
      
    • <generated_output_directory>/solutions.yaml

      instance_name: khloraascaf
      - ilp_combination:
          - ir
          - sc
        contigs_of_regions: <path>
        map_of_regions: <path>
      - ilp_combination:
          - dr
          - sc
        contigs_of_regions: <path>
        map_of_regions: <path>
      
    • <generated_output_directory>/debugs.yaml

      - ilp_combination:
          - ir
        starter_vertex: <occorc>
        ilp_status: Optimal
        opt_value: 3.0
        vertices_of_regions: <path>
        map_of_regions: <path>
        repfrag: <path>
      - ilp_combination:
          - dr
        starter_vertex: <occorc>
        ilp_status: Optimal
        opt_value: 3.0
        vertices_of_regions: <path>
        map_of_regions: <path>
        repfrag: <path>
      - ilp_combination:
          - ir
          - sc
        starter_vertex: <occorc>
        ilp_status: Optimal
        opt_value: 129.98000000000002
        vertices_of_regions: <path>
        map_of_regions: <path>
      - ilp_combination:
          - dr
          - sc
        starter_vertex: <occorc>
        ilp_status: Optimal
        opt_value: 129.98000000000002
        vertices_of_regions: <path>
        map_of_regions: <path>
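
The yyyy-mm-dd_HH-MM-SS_<instance_name> directory name format listed above could be produced, and later recovered from captured program output, roughly as follows. This is only a sketch: the function names are hypothetical, and how the program actually prints the directory to stdout is an assumption.

```python
import re
from datetime import datetime
from typing import Optional

# strftime pattern matching yyyy-mm-dd_HH-MM-SS
DIRNAME_TIME_FORMAT = "%Y-%m-%d_%H-%M-%S"

# Timestamp followed by "_" and an instance name without spaces or slashes
DIRNAME_REGEX = re.compile(r"\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2}_[^\s/]+")


def fmt_output_dirname(instance_name: str,
                       now: Optional[datetime] = None) -> str:
    """Format a UID-like directory name (hypothetical helper)."""
    now = datetime.now() if now is None else now
    return f"{now.strftime(DIRNAME_TIME_FORMAT)}_{instance_name}"


def extract_output_dirname(stdout_text: str) -> Optional[str]:
    """Return the first directory name found in the text, else None."""
    match = DIRNAME_REGEX.search(stdout_text)
    return match.group(0) if match else None
```

Passing the timestamp explicitly keeps the formatter deterministic in tests; the regular expression only recognises names that follow the format above.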
      

Features

  • #TODO run_example.sh script that uses the same data as the function example

  • #TODO function there_is_a_solution() -> bool

  • #TODO function to extract the output directory UID from stdout

  • #IDEA implement one class for each metadata type? (io, solutions, all_scaffolding)

    • GOOD: use getters instead of constant keys on a dict whose logic is hidden

    • GOOD: use a class method to retrieve an existing one (raise if it does not exist)

    • GOOD: init only on creation of a new metadata

    • BAD: repeated keys as class constants (e.g. KEY_ILP_COMBINATION)

    • GOOD: hide keys

    • SOSO: functions bump_* become methods write_to_yaml (note: this is not “bump”…)

Next (research world)

  • #IDEA think about, e.g., several IRs and how to adapt these functions below?

    • If the PuLP model wants to keep sub-paths while preventing the previously repeated fragments from being paired, is it sufficient to read the previous repeated regions given by the result object?

  • #IDEA should the solution construction be repeated until nothing changes?

    .
    ├── ir
    │   ├── ir-ir
    │   │   ├── ir-ir-ir
    │   │   ├── ir-ir-dr
    │   │   └── ir-ir-un
    │   └── ir-dr
    └── dr
        ├── dr-ir
        └── dr-dr
    
    • For now, go at most as far as (ir-dr-un) or (dr-ir-un)
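
One possible reading of the current limit, sketched below: each repeat kind is used at most once per combination, and every combination ends with the "un" step. The function name and its parameters are hypothetical, and whether shorter combinations such as (ir-un) are also wanted is an assumption.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def enumerate_combinations(
    repeat_kinds: Sequence[str] = ("ir", "dr"),
    max_repeats: int = 2,
    terminal: str = "un",
) -> List[Tuple[str, ...]]:
    """List the combinations to try, shortest first (hypothetical helper).

    Each repeat kind appears at most once in a combination, and the
    terminal step is appended at the end of every combination.
    """
    combinations = []
    for n_repeats in range(1, max_repeats + 1):
        # permutations never reuses a kind, so ir-ir and dr-dr are skipped
        for kinds in permutations(repeat_kinds, n_repeats):
            combinations.append(tuple(kinds) + (terminal,))
    return combinations
```

Exploring the deeper branches of the tree (ir-ir-dr and so on) would mean allowing repeated kinds, for instance with itertools.product instead of permutations.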