cogent3.core.alphabet.KmerAlphabet

class KmerAlphabet(words: tuple[TStrOrBytes, ...], monomers: CharAlphabet[TStrOrBytes], k: int, gap: TStrOrBytes | None = None, missing: TStrOrBytes | None = None)

Represents complete k-mer (non-monomer) alphabets, i.e. every k-length word over the provided monomer alphabet.

Attributes:
gap_char
gap_index
missing_char
missing_index
moltype
motif_len
num_canonical

Methods

count(value, /)

Return number of occurrences of value.

from_index(kmer_index)

decodes an integer into a k-mer

from_indices(kmer_indices[, independent_kmer])

converts array of k-mer indices into an array of monomer indices

from_rich_dict(data)

returns an instance from a serialised dictionary

index(value[, start, stop])

Return first index of value.

is_valid(seq)

returns whether seq is valid for this alphabet

to_index(seq[, validate])

encodes a k-mer as a single integer

to_indices(seq[, validate, independent_kmer])

returns a sequence of k-mer indices

to_json()

returns a serialisable string

to_rich_dict([for_pickle])

returns a serialisable dictionary

with_gap_motif([include_missing])

returns a new KmerAlphabet with the gap motif added
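The integer encoding described by to_index, from_index and to_indices can be pictured as positional (base-n) arithmetic over the monomers. The following is a minimal sketch in plain Python, not cogent3's implementation: the helper names, the monomer order "TCAG", and the digit order (first monomer most significant) are all assumptions made for illustration. The `independent` flag is assumed to mirror independent_kmer by choosing between non-overlapping and overlapping k-mers.

```python
# Illustrative sketch only: these helpers are hypothetical and are not
# part of cogent3. Monomer order and digit order are assumptions.

MONOMERS = "TCAG"  # assumed monomer order, for illustration only


def kmer_to_index(kmer: str, monomers: str = MONOMERS) -> int:
    """Encode a k-mer as a base-len(monomers) integer."""
    index = 0
    for char in kmer:
        # shift previous digits up one place, then add this monomer's digit
        index = index * len(monomers) + monomers.index(char)
    return index


def index_to_kmer(index: int, k: int, monomers: str = MONOMERS) -> str:
    """Decode an integer back into a k-mer of length k."""
    chars = []
    for _ in range(k):
        # peel off the least significant digit each iteration
        index, digit = divmod(index, len(monomers))
        chars.append(monomers[digit])
    return "".join(reversed(chars))


def seq_to_kmer_indices(
    seq: str, k: int, independent: bool = True, monomers: str = MONOMERS
) -> list[int]:
    """Indices of the k-mers in seq.

    Steps by k (non-overlapping) when independent is True, else by 1
    (overlapping). This reading of the flag is an assumption.
    """
    step = k if independent else 1
    return [
        kmer_to_index(seq[i : i + k], monomers)
        for i in range(0, len(seq) - k + 1, step)
    ]


if __name__ == "__main__":
    idx = kmer_to_index("CT")
    print(idx, index_to_kmer(idx, k=2))
    print(seq_to_kmer_indices("TCAG", k=2))
    print(seq_to_kmer_indices("TCAG", k=2, independent=False))
```

Because every k-length word has a slot, encoding and decoding need only arithmetic on the monomer indices; no table of words is required.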

Notes

Differs from SenseCodonAlphabet by representing every possible k-length word over the provided monomer alphabet, rather than a subset of them. Because the set of words is complete, the mapping between integers and k-length strings is more efficient.
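To see why completeness matters for the mapping, the sketch below compares a complete set of 3-mers with a reduced set. It is plain Python and does not use cogent3. The monomer order "TCAG", the helper name `arithmetic_index`, and the choice of the three standard-code stop codons as the excluded words are assumptions for illustration.

```python
from itertools import product

MONOMERS = "TCAG"  # assumed monomer order, for illustration only
K = 3

# Complete alphabet: every k-length word, in itertools.product order.
all_words = tuple("".join(p) for p in product(MONOMERS, repeat=K))


def arithmetic_index(word: str) -> int:
    """Hypothetical helper: base-len(MONOMERS) value of a word."""
    index = 0
    for char in word:
        index = index * len(MONOMERS) + MONOMERS.index(char)
    return index


# For the complete set, the arithmetic index equals the word's position.
complete_ok = all(arithmetic_index(w) == i for i, w in enumerate(all_words))

# Removing words (assumed here: the standard-code stop codons) leaves
# holes, so positions must come from an explicit lookup table instead.
STOPS = {"TAA", "TAG", "TGA"}
sense_words = tuple(w for w in all_words if w not in STOPS)
sense_lookup = {w: i for i, w in enumerate(sense_words)}

# Words whose arithmetic index no longer matches their position.
mismatched = [w for w in sense_words if arithmetic_index(w) != sense_lookup[w]]

if __name__ == "__main__":
    print(len(all_words), complete_ok)
    print(len(sense_words), len(mismatched) > 0)
```

In the reduced set, every word that sorts after the first removed word shifts position, which is why an incomplete alphabet cannot rely on arithmetic alone.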