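For the experiments, we collected the following data:
·1,000 cryptolocker domains;
·1,000 post-tovar-goz domains;
·the top 1,000 Alexa domains.
Each line of a DGA file has the following format: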
xsxqeadsbgvpdke.co.uk,Domain used by Cryptolocker - Flashback DGA for 13 Apr 2017,2017-04-13,http://osint.bambenekconsulting.com/manual/cl.txt
Extract the domain names from a DGA file:
def load_dga(filename):
    domain_list = []
    # Sample line:
    # xsxqeadsbgvpdke.co.uk,Domain used by Cryptolocker - Flashback DGA for 13 Apr 2017,2017-04-13,
    # http://osint.bambenekconsulting.com/manual/cl.txt
    with open(filename) as f:
        for line in f:
            # The domain is the first comma-separated field.
            domain = line.split(",")[0]
            # Keep only domains of at least MIN_LEN characters.
            if len(domain) >= MIN_LEN:
                domain_list.append(domain)
    return domain_list
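The Alexa file is a CSV file that stores each domain's rank and name; the domains are extracted as follows:
import csv

def load_alexa(filename):
    domain_list = []
    with open(filename) as f:
        for row in csv.reader(f):
            # Column 0 is the rank, column 1 is the domain.
            domain = row[1]
            # Keep only domains of at least MIN_LEN characters.
            if len(domain) >= MIN_LEN:
                domain_list.append(domain)
    return domain_list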
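To see how the two file formats and the length filter behave, here is a minimal sketch that parses small in-memory samples instead of files. The value MIN_LEN = 10, the helper names parse_dga_lines and parse_alexa_rows, and the three Alexa rows are assumptions made for illustration; only the DGA record is taken from the sample shown earlier.

```python
import csv
import io

# Assumed threshold for illustration; the value used in the experiment is not given here.
MIN_LEN = 10


def parse_dga_lines(lines):
    """Return the first comma-separated field of each line that passes the length filter."""
    domains = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        domain = line.split(",")[0]
        if len(domain) >= MIN_LEN:
            domains.append(domain)
    return domains


def parse_alexa_rows(lines):
    """Return the second CSV column (the domain) of each row that passes the length filter."""
    domains = []
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # skip malformed rows
        domain = row[1]
        if len(domain) >= MIN_LEN:
            domains.append(domain)
    return domains


# One DGA record in the format shown in the text.
dga_sample = io.StringIO(
    "xsxqeadsbgvpdke.co.uk,Domain used by Cryptolocker - Flashback DGA for 13 Apr 2017,"
    "2017-04-13,http://osint.bambenekconsulting.com/manual/cl.txt\n"
)
# Hypothetical rank,domain rows standing in for the Alexa CSV.
alexa_sample = io.StringIO("1,google.com\n2,youtube.com\n3,qq.com\n")

dga_domains = parse_dga_lines(dga_sample)
alexa_domains = parse_alexa_rows(alexa_sample)

print(dga_domains)
print(alexa_domains)
```

With this assumed threshold, qq.com (6 characters) is dropped, while google.com (exactly 10 characters) is kept because the comparison is inclusive.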