Day 5: genome sequencing

Yesterday I did some coding on CodinGame and tackled my first hard training puzzle, Genome Sequencing.

As mentioned previously, I won’t share my solution here, only my approach to solving the problem.

First I approached the task with De Bruijn graphs. It seemed like a good idea, and I got results for all the test cases but the last one. There was a problem with that result: it was too short and missed one of the subsequences. So I decided that hour of coding had been a waste of my time.

So I tried some different approaches and came up with the following “algorithm”:

  • filter out subsequences which are already contained in at least one other subsequence
  • take the sum of the lengths of the remaining subsequences as the initial minimum length (simply concatenating them always works, so the answer cannot be longer than that)
  • for every permutation of the remaining subsequences, build the combined sequence and keep the smallest length found

How do we find the combined sequence for a permutation? Well, this is really simple: we go through the permutation in order and merge each next sub-sequence into the sequence combined so far, reusing the longest overlap between the end of the combined sequence and the start of the next sub-sequence. Let’s see an example:

We have the following sub-sequences: AGATTA, GATTACA, and TACAGA. None of them is contained in another, so after the first step we continue with all three. The initial minimum length is their total length, 6 + 7 + 6 = 19.
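
The first step and the starting length can be sketched in a few lines of Python. This is only an illustration of the idea, not the submitted solution: the helper name `drop_contained` is made up for this post, and dropping exact duplicates first is an assumption about how repeated inputs should be treated.

```python
def drop_contained(parts):
    """Keep only the sub-sequences that are not inside another one."""
    # Assumption: exact duplicates count once, so remove them first (order kept).
    unique = list(dict.fromkeys(parts))
    return [p for p in unique if not any(p != q and p in q for q in unique)]


parts = ["AGATTA", "GATTACA", "TACAGA"]
remaining = drop_contained(parts)

# Plain concatenation always contains every piece, so this is a safe start value.
initial_minimum = sum(len(p) for p in remaining)

print(remaining, initial_minimum)  # ['AGATTA', 'GATTACA', 'TACAGA'] 19
```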

Now we create all 6 permutations of these sub-sequences and combine them to form a sequence:

  • GATTACA, TACAGA, AGATTA results in GATTACAGATTA with a length of 12, so this is our current minimum
  • GATTACA, AGATTA, TACAGA results in GATTACAGATTA (TACAGA is already contained in it) with a length of 12, so the minimum stays 12
  • TACAGA, GATTACA, AGATTA results in TACAGATTACA (AGATTA is already contained in it) with a length of 11, so this is our new minimum
  • TACAGA, AGATTA, GATTACA results in TACAGATTACA with a length of 11, so the minimum does not change
  • AGATTA, GATTACA, TACAGA results in AGATTACAGA with a length of 10, so this is the new minimum
  • AGATTA, TACAGA, GATTACA results in AGATTACAGA (GATTACA is already contained in it) with a length of 10, so the minimum does not change
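
Each line above comes from the same combining step, sketched below in Python under a few assumptions: the name `merge` is made up, the longest overlap between the end of the left string and the start of the right one is reused, and a piece that already occurs inside the combined string adds nothing (the second, third and sixth lines only give those lengths with such a check).

```python
def merge(left: str, right: str) -> str:
    """Append `right` to `left`, reusing the longest possible overlap."""
    # A piece that is already inside the combined string adds nothing.
    if right in left:
        return left
    # Try the longest candidate overlap first, then shrink it.
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    # No overlap at all: plain concatenation.
    return left + right


print(merge("AGATTA", "GATTACA"))       # AGATTACA
print(merge("AGATTACA", "TACAGA"))      # AGATTACAGA
print(merge("GATTACAGATTA", "TACAGA"))  # GATTACAGATTA
```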

After trying all the permutations we end up with a minimum length of 10, and this is the result for the example.
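
Putting the three steps together, a minimal brute-force sketch in Python could look like this. It is an illustration rather than the submitted solution: `merge` and `shortest_length` are hypothetical names, and trying every permutation grows factorially, so it is only practical for a handful of sub-sequences.

```python
from itertools import permutations


def merge(left: str, right: str) -> str:
    """Append `right` to `left`, reusing the longest possible overlap."""
    if right in left:  # already contained, nothing to add
        return left
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    return left + right


def shortest_length(parts):
    """Length of the shortest combined sequence found over all orderings."""
    # Step 1: drop duplicates and pieces contained in another piece.
    unique = list(dict.fromkeys(parts))
    remaining = [p for p in unique if not any(p != q and p in q for q in unique)]
    # Step 2: plain concatenation is the starting minimum.
    best = sum(len(p) for p in remaining)
    # Step 3: merge the pieces in every possible order, keep the smallest length.
    for order in permutations(remaining):
        combined = ""
        for part in order:
            combined = merge(combined, part)
        best = min(best, len(combined))
    return best


print(shortest_length(["AGATTA", "GATTACA", "TACAGA"]))  # 10
```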
