187. Repeated DNA Sequences

Problem:
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
For example,
Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return:
["AAAAACCCCC", "CCCCCAAAAA"].
Analysis:
用bit manipulation 太难了,想不出来。
Use two sets, one stores distinct string, the other stores repeated.

Solution:

class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        List<String> res = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Set<String> repeated = new HashSet<>();
        if (s == null || s.length() == 0) return res;
        for (int i = 0; i <= s.length() - 10; i++) {
            String temp = s.substring(i, i + 10);
            if (!seen.add(temp)) {
                repeated.add(temp);
            }
        }
        res.addAll(repeated);
        return res;
    }
}

评论

此博客中的热门博文

776. Split BST

663. Equal Tree Partition

532. K-diff Pairs in an Array