Changeset 863:f28df3d85602
- Timestamp: 01/16/08 13:01:09 (1 year ago)
- Files:
- tool_conf.xml.main (modified) (3 diffs)
- tool_conf.xml.sample (modified) (3 diffs)
- tools/extract/genebed_maf_to_fasta.py (deleted)
- tools/extract/interval2maf_pairwise.py (deleted)
- tools/extract/interval_maf_to_merged_fasta.py (deleted)
- tools/maf/genebed_maf_to_fasta.xml (moved) (moved from tools/extract/genebed_maf_to_fasta.xml) (1 diff)
- tools/maf/interval2maf.py (moved) (moved from tools/extract/interval2maf.py) (5 diffs)
- tools/maf/interval2maf.xml (moved) (moved from tools/extract/interval2maf.xml)
- tools/maf/interval2maf_pairwise.xml (moved) (moved from tools/extract/interval2maf_pairwise.xml) (1 diff)
- tools/maf/interval_maf_to_merged_fasta.py (added)
- tools/maf/interval_maf_to_merged_fasta.xml (moved) (moved from tools/extract/interval_maf_to_merged_fasta.xml) (1 diff)
- tools/maf/maf_by_block_number.py (moved) (moved from tools/filters/maf/maf_by_block_number.py) (1 diff)
- tools/maf/maf_by_block_number.xml (moved) (moved from tools/filters/maf/maf_by_block_number.xml)
- tools/maf/maf_filter.py (moved) (moved from tools/filters/maf/maf_filter.py)
- tools/maf/maf_filter.xml (moved) (moved from tools/filters/maf/maf_filter.xml)
- tools/maf/maf_limit_size.py (moved) (moved from tools/filters/maf/maf_limit_size.py)
- tools/maf/maf_limit_size.xml (moved) (moved from tools/filters/maf/maf_limit_size.xml)
- tools/maf/maf_limit_to_species.py (moved) (moved from tools/filters/maf/maf_limit_to_species.py)
- tools/maf/maf_limit_to_species.xml (moved) (moved from tools/filters/maf/maf_limit_to_species.xml)
- tools/maf/maf_reverse_complement.py (moved) (moved from tools/filters/maf/maf_reverse_complement.py)
- tools/maf/maf_reverse_complement.xml (moved) (moved from tools/filters/maf/maf_reverse_complement.xml)
- tools/maf/maf_stats.py (moved) (moved from tools/filters/maf/maf_stats.py) (6 diffs)
- tools/maf/maf_stats.xml (moved) (moved from tools/filters/maf/maf_stats.xml)
- tools/maf/maf_stats_code.py (moved) (moved from tools/filters/maf/maf_stats_code.py)
- tools/maf/maf_thread_for_species.py (moved) (moved from tools/filters/maf/maf_thread_for_species.py) (1 diff)
- tools/maf/maf_thread_for_species.xml (moved) (moved from tools/filters/maf/maf_thread_for_species.xml)
- tools/maf/maf_to_bed.py (moved) (moved from tools/filters/maf/maf_to_bed.py) (1 diff)
- tools/maf/maf_to_bed.xml (moved) (moved from tools/filters/maf/maf_to_bed.xml)
- tools/maf/maf_to_bed_code.py (moved) (moved from tools/filters/maf/maf_to_bed_code.py)
- tools/maf/maf_to_fasta.xml (moved) (moved from tools/filters/maf/maf_to_fasta.xml)
- tools/maf/maf_to_fasta_concat.py (moved) (moved from tools/filters/maf/maf_to_fasta_concat.py) (5 diffs)
- tools/maf/maf_to_fasta_multiple_sets.py (moved) (moved from tools/filters/maf/maf_to_fasta_multiple_sets.py) (3 diffs)
- tools/maf/maf_utilities.py (added)
tool_conf.xml.main
r796 r863
   </section>
   <section name="ENCODE Tools" id="EncodeTools">
-    <!-- <tool file="extract/interval2maf.xml" />
+    <!-- <tool file="maf/interval2maf.xml" />
     <tool file="extract/phastOdds/phastOdds_tool.xml" />
     <tool file="stats/aggregate_binned_scores_in_intervals.xml" /> -->
…
   </section>
   <section name="Convert Formats" id="convert">
-    <tool file="filters/maf/maf_to_fasta.xml" />
-    <tool file="filters/maf/maf_to_bed.xml" />
+    <tool file="maf/maf_to_fasta.xml" />
+    <tool file="maf/maf_to_bed.xml" />
     <tool file="filters/gff2bed.xml" />
     <tool file="filters/bed2gff.xml" />
…
   </section>
   <section name="Fetch Alignments" id="fetchAlign">
-    <tool file="extract/interval2maf_pairwise.xml" />
-    <tool file="extract/interval2maf.xml" />
-    <tool file="extract/interval_maf_to_merged_fasta.xml" />
-    <!-- <tool file="extract/genebed_maf_to_fasta.xml"/>
-    <tool file="filters/maf/maf_stats.xml"/> -->
-    <tool file="filters/maf/maf_limit_to_species.xml"/>
-    <tool file="filters/maf/maf_limit_size.xml"/>
-    <tool file="filters/maf/maf_by_block_number.xml"/>
+    <tool file="maf/interval2maf_pairwise.xml" />
+    <tool file="maf/interval2maf.xml" />
+    <tool file="maf/interval_maf_to_merged_fasta.xml" />
+    <tool file="maf/genebed_maf_to_fasta.xml"/>
+    <tool file="maf/maf_stats.xml"/>
+    <tool file="maf/maf_thread_for_species.xml"/>
+    <tool file="maf/maf_limit_to_species.xml"/>
+    <tool file="maf/maf_limit_size.xml"/>
+    <tool file="maf/maf_by_block_number.xml"/>
+    <!-- <tool file="maf/maf_reverse_complement.xml"/>
+    <tool file="maf/maf_filter.xml"/> -->
   </section>
   <section name="Get Genomic Scores" id="scores">
tool_conf.xml.sample
r848 r863
   </section>
   <section name="ENCODE Tools" id="EncodeTools">
-    <!-- <tool file="extract/interval2maf.xml" />
+    <!-- <tool file="maf/interval2maf.xml" />
     <tool file="extract/phastOdds/phastOdds_tool.xml" />
     <tool file="stats/aggregate_binned_scores_in_intervals.xml" /> -->
…
   </section>
   <section name="Convert Formats" id="convert">
-    <tool file="filters/maf/maf_to_fasta.xml" />
-    <tool file="filters/maf/maf_to_bed.xml" />
+    <tool file="maf/maf_to_fasta.xml" />
+    <tool file="maf/maf_to_bed.xml" />
     <tool file="filters/gff2bed.xml" />
     <tool file="filters/bed2gff.xml" />
…
   </section>
   <section name="Fetch Alignments" id="fetchAlign">
-    <tool file="extract/interval2maf_pairwise.xml" />
-    <tool file="extract/interval2maf.xml" />
-    <tool file="extract/interval_maf_to_merged_fasta.xml" />
-    <tool file="extract/genebed_maf_to_fasta.xml"/>
-    <tool file="filters/maf/maf_stats.xml"/>
-    <tool file="filters/maf/maf_thread_for_species.xml"/>
-    <tool file="filters/maf/maf_limit_to_species.xml"/>
-    <tool file="filters/maf/maf_limit_size.xml"/>
-    <tool file="filters/maf/maf_by_block_number.xml"/>
-    <tool file="filters/maf/maf_reverse_complement.xml"/>
-    <tool file="filters/maf/maf_filter.xml"/>
+    <tool file="maf/interval2maf_pairwise.xml" />
+    <tool file="maf/interval2maf.xml" />
+    <tool file="maf/interval_maf_to_merged_fasta.xml" />
+    <tool file="maf/genebed_maf_to_fasta.xml"/>
+    <tool file="maf/maf_stats.xml"/>
+    <tool file="maf/maf_thread_for_species.xml"/>
+    <tool file="maf/maf_limit_to_species.xml"/>
+    <tool file="maf/maf_limit_size.xml"/>
+    <tool file="maf/maf_by_block_number.xml"/>
+    <tool file="maf/maf_reverse_complement.xml"/>
+    <tool file="maf/maf_filter.xml"/>
   </section>
   <section name="Get Genomic Scores" id="scores">
tools/maf/genebed_maf_to_fasta.xml
r765 r863
 <tool id="GeneBed_Maf_Fasta2" name="Stitch Gene blocks">
   <description>given a set of coding exon intervals</description>
-  <command interpreter="python2.4">#if $maf_source_type.maf_source == "user":#genebed_maf_to_fasta.py $dbkey $maf_source_type.species $maf_source_type.maf_file $input1 $out_file1 $maf_source_type.maf_source
-#else:#genebed_maf_to_fasta.py $dbkey $maf_source_type.species $maf_source_type.maf_identifier $input1 $out_file1 $maf_source_type.maf_source
+  <command interpreter="python2.4">#if $maf_source_type.maf_source == "user":#interval_maf_to_fasta.py --dbkey=$dbkey --species=$maf_source_type.species --mafSource=$maf_source_type.maf_file --interval_file=$input1 --output_file=$out_file1 --mafSourceType=$maf_source_type.maf_source --geneBED
+#else:#interval_maf_to_fasta.py --dbkey=$dbkey --species=$maf_source_type.species --mafSource=$maf_source_type.maf_identifier --interval_file=$input1 --output_file=$out_file1 --mafSourceType=$maf_source_type.maf_source --geneBED
 #end if
   </command>
tools/maf/interval2maf.py
r749 r863
     -o, --output_file=o:       Output MAF file
     -p, --species=p:       Species to include in output
+    -l, --indexLocation=l:       Override default maf_index.loc file
 """
 
…
 import bx.align.maf
 import bx.intervals.io
-import bx.interval_index_file
-import sys, os, tempfile
-
-MAF_LOCATION_FILE = "/depot/data2/galaxy/maf_index.loc"
-
-def maf_index_by_uid( maf_uid ):
-    for line in open( MAF_LOCATION_FILE ):
-        try:
-            #read each line, if not enough fields, go to next line
-            if line[0:1] == "#" : continue
-            fields = line.split('\t')
-            if maf_uid == fields[1]:
-                try:
-                    maf_files = fields[3].replace( "\n", "" ).replace( "\r", "" ).split( "," )
-                    return bx.align.maf.MultiIndexed( maf_files, keep_open = True, parse_e_rows = True )
-                except Exception, e:
-                    raise 'MAF UID (%s) found, but configuration appears to be malformed: %s' % ( maf_uid, e )
-        except:
-            pass
-    return None
-
-#builds and returns (index, index_filename) for specified maf_file
-def build_maf_index( maf_file, species = None ):
-    indexes = bx.interval_index_file.Indexes()
-    try:
-        maf_reader = bx.align.maf.Reader( open( maf_file ) )
-        # Need to be a bit tricky in our iteration here to get the 'tells' right
-        while True:
-            pos = maf_reader.file.tell()
-            block = maf_reader.next()
-            if block is None: break
-            for c in block.components:
-                if species is not None and c.src.split( "." )[0] not in species:
-                    continue
-                indexes.add( c.src, c.forward_strand_start, c.forward_strand_end, pos )
-        fd, index_filename = tempfile.mkstemp()
-        out = os.fdopen( fd, 'w' )
-        indexes.write( out )
-        out.close()
-        return ( bx.align.maf.Indexed( maf_file, index_filename = index_filename, keep_open = True, parse_e_rows = True ), index_filename )
-    except:
-        return ( None, None )
+import maf_utilities
+import sys
 
 def __main__():
…
     mincols = 0
 
-    # Parse Command Line
+    #Parse Command Line
     options, args = doc_optparse.parse( __doc__ )
 
…
         print >>sys.stderr, "Output file has not been specified."
         sys.exit()
+    #Finish parsing command line
 
     #Open indexed access to MAFs
     if options.mafType:
-        index = maf_index_by_uid( options.mafType )
+        if options.indexLocation:
+            index = maf_utilities.maf_index_by_uid( options.mafType, options.indexLocation )
+        else:
+            index = maf_utilities.maf_index_by_uid( options.mafType )
         if index is None:
             print >> sys.stderr, "The MAF source specified (%s) appears to be invalid." % ( options.mafType )
             sys.exit()
     elif options.mafFile:
-        index, index_filename = build_maf_index( options.mafFile, species = [dbkey] )
+        index, index_filename = maf_utilities.build_maf_index( options.mafFile, species = [dbkey] )
         if index is None:
             print >> sys.stderr, "Your MAF file appears to be malformed."
…
         sys.exit()
 
+    #Create MAF writter
     out = bx.align.maf.Writer( open(output_file, "w") )
 
-    # Iterate over input regions
+    #Iterate over input regions
     num_blocks = 0
-    num_lines = 0
-    for num_lines, region in enumerate( bx.intervals.io.NiceReaderWrapper( open( interval_file, 'r' ), chrom_col = chromCol, start_col = startCol, end_col = endCol, strand_col = strandCol, fix_strand = True, return_header = False, return_comments = False ) ):
-        try:
-            src = "%s.%s" % ( dbkey, region.chrom )
-
-            blocks = index.get( src, region.start, region.end )
-
-            for block in blocks:
-                ref = block.get_component_by_src( src )
-                #We want our block coordinates to be from positive strand
-                if ref.strand == "-":
-                    block = block.reverse_complement()
-                    ref = block.get_component_by_src( src )
-
-                #save old score here for later use
-                old_score = block.score
-                slice_start = max( region.start, ref.start )
-                slice_end = min( region.end, ref.end )
-
-                #when interval is out-of-range (not in maf index), fail silently: else could create tons of scroll
-                try:
-                    block = block.slice_by_component( ref, slice_start, slice_end )
-                except:
-                    continue
-
-                if block.text_size > mincols:
-                    if region.strand != ref.strand: block = block.reverse_complement()
-                    # restore old score, may not be accurate, but it is better than 0 for everything
-                    block.score = old_score
-                    if species is not None:
-                        block = block.limit_to_species( species )
-                        block.remove_all_gap_columns()
-                    out.write( block )
-                    num_blocks += 1
-        except Exception, e:
-            print "Error found on input line %s: %s." % ( num_lines, e )
-            continue
+    num_regions = None
+    for num_regions, region in enumerate( bx.intervals.io.NiceReaderWrapper( open( interval_file, 'r' ), chrom_col = chromCol, start_col = startCol, end_col = endCol, strand_col = strandCol, fix_strand = True, return_header = False, return_comments = False ) ):
+        src = "%s.%s" % ( dbkey, region.chrom )
+        for block in maf_utilities.get_chopped_blocks_for_region( index, src, region, species, mincols ):
+            out.write( block )
+            num_blocks += 1
 
-    # Close output MAF
+    #Close output MAF
     out.close()
 
     #remove index file if created during run
-    if index_filename is not None:
-        os.unlink( index_filename )
+    maf_utilities.remove_temp_index_file( index_filename )
 
-    print "%s MAF blocks extracted." % num_blocks
+    if num_blocks:
+        print "%i MAF blocks extracted for %i regions." % ( num_blocks, ( num_regions + 1 ) )
+    elif num_regions is not None:
+        print "No MAF blocks could be extracted for %i regions." % ( num_regions + 1 )
+    else:
+        print "No valid regions have been provided."
 
 if __name__ == "__main__": __main__()
tools/maf/interval2maf_pairwise.xml
r760 r863
 <tool id="Interval2Maf_pairwise1" name="Extract Pairwise MAF blocks">
   <description>given a set of genomic intervals</description>
-  <command interpreter="python2.4">interval2maf_pairwise.py --dbkey=$dbkey --chromCol=$input1_chromCol --startCol=$input1_startCol --endCol=$input1_endCol --strandCol=$input1_strandCol --mafType=$mafType --interval_file=$input1 --output_file=$out_file1</command>
+  <command interpreter="python2.4">interval2maf.py --dbkey=$input1_dbkey --chromCol=$input1_chromCol --startCol=$input1_startCol --endCol=$input1_endCol --strandCol=$input1_strandCol --mafType=$mafType --interval_file=$input1 --output_file=$out_file1 --indexLocation=/depot/data2/galaxy/maf_pairwise.loc</command>
   <inputs>
     <param name="input1" type="data" format="interval" label="Interval File"/>
tools/maf/interval_maf_to_merged_fasta.xml
r765 r863
 <tool id="Interval_Maf_Merged_Fasta2" name="Stitch MAF blocks">
   <description>given a set of genomic intervals</description>
-  <command interpreter="python2.4">#if $maf_source_type.maf_source == "user":#interval_maf_to_merged_fasta.py $dbkey $maf_source_type.species $maf_source_type.maf_file $input1 $out_file1 $input1_chromCol $input1_startCol $input1_endCol $input1_strandCol$maf_source_type.maf_source
-#else:#interval_maf_to_merged_fasta.py $dbkey $maf_source_type.species $maf_source_type.maf_identifier $input1 $out_file1 $input1_chromCol $input1_startCol $input1_endCol $input1_strandCol$maf_source_type.maf_source
+  <command interpreter="python2.4">#if $maf_source_type.maf_source == "user":#interval_maf_to_merged_fasta.py --dbkey=$dbkey --species=$maf_source_type.species --mafSource=$maf_source_type.maf_file --interval_file=$input1 --output_file=$out_file1 --chromCol=$input1_chromCol --startCol=$input1_startCol --endCol=$input1_endCol --strandCol=$input1_strandCol --mafSourceType=$maf_source_type.maf_source
+#else:#interval_maf_to_merged_fasta.py --dbkey=$dbkey --species=$maf_source_type.species --mafSource=$maf_source_type.maf_identifier --interval_file=$input1 --output_file=$out_file1 --chromCol=$input1_chromCol --startCol=$input1_startCol --endCol=$input1_endCol --strandCol=$input1_strandCol --mafSourceType=$maf_source_type.maf_source
 #end if
   </command>
tools/maf/maf_by_block_number.py
r740 r863
             continue
         try:
-            for count, m in enumerate( bx.align.maf.Reader( open( input_maf_filename, 'r' ) ) ):
+            for count, block in enumerate( bx.align.maf.Reader( open( input_maf_filename, 'r' ) ) ):
                 if count == block_wanted:
-                    maf_writer.write( m )
+                    maf_writer.write( block )
                     break
         except:
tools/maf/maf_stats.py
r742 r863
 """
 
-import sys, tempfile, os
+import sys
 import pkg_resources; pkg_resources.require( "bx-python" )
-import bx.align.maf
 import bx.intervals.io
-import bx.interval_index_file
-import psyco_full
 from numpy import zeros
-
-MAF_LOCATION_FILE = "/depot/data2/galaxy/maf_index.loc"
-
-def maf_index_by_uid( maf_uid ):
-    for line in open( MAF_LOCATION_FILE ):
-        try:
-            #read each line, if not enough fields, go to next line
-            if line[0:1] == "#" : continue
-            fields = line.split('\t')
-            if maf_uid == fields[1]:
-                try:
-                    maf_files = fields[3].replace( "\n", "" ).replace( "\r", "" ).split( "," )
-                    return bx.align.maf.MultiIndexed( maf_files, keep_open = True, parse_e_rows = False )
-                except Exception, e:
-                    raise 'MAF UID (%s) found, but configuration appears to be malformed: %s' % ( maf_uid, e )
-        except:
-            pass
-    return None
-
-#builds and returns (index, index_filename) for specified maf_file
-def build_maf_index( maf_file, species = None ):
-    indexes = bx.interval_index_file.Indexes()
-    try:
-        maf_reader = bx.align.maf.Reader( open( maf_file ) )
-        # Need to be a bit tricky in our iteration here to get the 'tells' right
-        while True:
-            pos = maf_reader.file.tell()
-            block = maf_reader.next()
-            if block is None: break
-            for c in block.components:
-                if species is not None and c.src.split( "." )[0] not in species:
-                    continue
-                indexes.add( c.src, c.forward_strand_start, c.forward_strand_end, pos )
-        fd, index_filename = tempfile.mkstemp()
-        out = os.fdopen( fd, 'w' )
-        indexes.write( out )
-        out.close()
-        return ( bx.align.maf.Indexed( maf_file, index_filename = index_filename, keep_open = True, parse_e_rows = False ), index_filename )
-    except:
-        return ( None, None )
-
+import maf_utilities
 
 def __main__():
…
     if maf_source_type == "user":
         #index maf for use here
-        index, index_filename = build_maf_index( input_maf_filename, species = [dbkey] )
+        index, index_filename = maf_utilities.build_maf_index( input_maf_filename, species = [dbkey] )
         if index is None:
             print >>sys.stderr, "Your MAF file appears to be malformed."
…
     elif maf_source_type == "cached":
         #access existing indexes
-        index = maf_index_by_uid( input_maf_filename )
+        index = maf_utilities.maf_index_by_uid( input_maf_filename )
         if index is None:
             print >> sys.stderr, "The MAF source specified (%s) appears to be invalid." % ( input_maf_filename )
…
         coverage = { dbkey: zeros( region.end - region.start, dtype = bool ) }
 
-        blocks = index.get( src, region.start, region.end )
-        for maf in blocks:
+        for block in maf_utilities.get_chopped_blocks_for_region( index, src, region, force_strand='+' ):
             #make sure all species are known
-            for c in maf.components:
+            for c in block.components:
                 spec = c.src.split( '.' )[0]
                 if spec not in coverage: coverage[spec] = zeros( region.end - region.start, dtype = bool )
-            #slice maf by start and end
-            ref = maf.get_component_by_src( src )
-            # If the reference component is on the '-' strand we should complement the interval
-            if ref.strand == '-':
-                maf = maf.reverse_complement()
-                ref = maf.get_component_by_src( src )
-            slice_start = max( region.start, ref.start )
-            slice_end = min( region.end, ref.end )
-            try:
-                maf = maf.slice_by_component( ref, slice_start, slice_end )
-            except:
-                continue
-            ref = maf.get_component_by_src( ref.src )
-
+            ref = block.get_component_by_src( src )
             #skip gap locations due to insertions in secondary species relative to primary species
-            start_offset = slice_start - region.start
+            start_offset = ref.start - region.start
             num_gaps = 0
             for i in range( len( ref.text.rstrip().rstrip( "-" ) ) ):
…
                     continue
                 #Toggle base if covered
-                for comp in maf.components:
+                for comp in block.components:
                     spec = comp.src.split( '.' )[0]
                     if comp.text and comp.text[i] not in ['-']:
…
     out.close()
     print "%i regions were processed with a total length of %i." % ( num_region, total_length )
-    if index_filename is not None:
-        os.unlink( index_filename )
+    maf_utilities.remove_temp_index_file( index_filename )
+
 if __name__ == "__main__": __main__()
tools/maf/maf_thread_for_species.py
r761 r863
                 m.score = 0.0
             maf_writer.write( m )
-    except:
-        print >> sys.stderr, "Error steping through MAF File"
+    except Exception, e:
+        print >> sys.stderr, "Error steping through MAF File: %s" % e
         sys.exit()
     maf_reader.close()
tools/maf/maf_to_bed.py
r740 r863
 Read a maf and output intervals for specified list of species.
 """
-
-from __future__ import division
-
-import textwrap
-import sys, tempfile, os
+import sys, os
 import pkg_resources; pkg_resources.require( "bx-python" )
 from bx.align import maf
tools/maf/maf_to_fasta_concat.py
r740 r863
 """
 #Dan Blankenberg
-from __future__ import division
-
-import textwrap
 import sys
 import pkg_resources; pkg_resources.require( "bx-python" )
 from bx.align import maf
+import maf_utilities
 
 def __main__():
…
 
     texts = {}
-
+
     input_filename = sys.argv[2]
     output_filename = sys.argv[3]
…
 
     if "None" in species:
-        species = get_species( input_filename )
+        species = maf_utilities.get_species_in_maf( input_filename )
 
     file_out = open( output_filename, 'w' )
…
         file_out.write( ">" + spec + "\n" )
         try:
-            for m in maf.Reader( open( input_filename, 'r' ) ):
-                c = m.get_component_by_src_start( spec )
-                if c: file_out.write( c.text )
+            for block in maf.Reader( open( input_filename, 'r' ) ):
+                component = block.get_component_by_src_start( spec )
+                if component: file_out.write( component.text )
                 else: file_out.write( "-" * m.text_size )
         except:
…
     file_out.close()
 
-def get_species( maf_filename ):
-    try:
-        species={}
-
-        file_in = open( maf_filename, 'r' )
-        maf_reader = maf.Reader( file_in )
-
-        for i, m in enumerate( maf_reader ):
-            l = m.components
-            for c in l:
-                spec, chrom = maf.src_split( c.src )
-                if not spec or not chrom:
-                    spec = chrom = c.src
-                species[spec] = spec
-
-        file_in.close()
-
-        species = species.keys()
-        species.sort()
-        return species
-    except:
-        return []
 
 if __name__ == "__main__": __main__()
tools/maf/maf_to_fasta_multiple_sets.py
r740 r863
 """
 #Dan Blankenberg
-from __future__ import division
-
-import textwrap
 import sys
 import pkg_resources; pkg_resources.require( "bx-python" )
…
 def __main__():
     print "Restricted to species:", sys.argv[3]
-
+
     input_filename = sys.argv[1]
     output_filename = sys.argv[2]
…
     file_out = open( output_filename, 'w' )
 
-    block_num = -1
-
-    for i, m in enumerate( maf_reader ):
-        block_num += 1
+    for block_num, block in enumerate( maf_reader ):
         if "None" not in species:
-            m = m.limit_to_species( species )
-        l = m.components
-        if len(l) < num_species and partial == "partial_disallowed": continue
-        for c in l:
-            spec, chrom = maf.src_split( c.src )
+            block = block.limit_to_species( species )
+        if len( block.components ) < num_species and partial == "partial_disallowed": continue
+        for component in block.components:
+            spec, chrom = maf.src_split( component.src )
             if not spec or not chrom:
-                spec = chrom = c.src
-            file_out.write( ">" + c.src + "(" + c.strand + "):" + str( c.start ) + "-" + str( c.end ) + "|" + spec + "_" + str( block_num ) + "\n" )
-            file_out.write( c.text + "\n" )
+                spec = chrom = component.src
+            file_out.write( ">" + component.src + "(" + component.strand + "):" + str( component.start ) + "-" + str( component.end ) + "|" + spec + "_" + str( block_num ) + "\n" )
+            file_out.write( component.text + "\n" )
         file_out.write( "\n" )
     file_in.close()
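
Note on the .loc lookup: the per-tool maf_index_by_uid copies removed in this changeset all scan a tab-separated .loc file for a MAF UID, and interval2maf.py now passes an optional --indexLocation to choose a different .loc file. A minimal sketch of that lookup follows, under these assumptions: it is written in modern Python rather than the Python 2.4 used here, the function name find_maf_files_by_uid is hypothetical, and it returns the list of MAF paths instead of building a bx.align.maf.MultiIndexed object. The column layout (UID in the second field, comma-separated MAF files in the fourth) and the default path come from the removed code; the actual contents of maf_utilities.py are not shown in this changeset view and may differ.

```python
# Default location, as hard-coded in the removed per-tool copies.
DEFAULT_LOC_FILE = "/depot/data2/galaxy/maf_index.loc"


def find_maf_files_by_uid(maf_uid, loc_file=DEFAULT_LOC_FILE):
    """Return the MAF file paths registered for maf_uid, or None if absent.

    Hypothetical helper: loc_file plays the role of --indexLocation.
    """
    with open(loc_file) as handle:
        for line in handle:
            # Comment lines start with '#'.
            if line.startswith("#"):
                continue
            fields = line.rstrip("\r\n").split("\t")
            # Lines with too few fields are skipped rather than treated as errors.
            if len(fields) < 4:
                continue
            if fields[1] == maf_uid:
                return fields[3].split(",")
    return None
```

Passing a different loc_file corresponds to what interval2maf_pairwise.xml does with --indexLocation=/depot/data2/galaxy/maf_pairwise.loc, which may be how the separate pairwise script could be deleted.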