###Gene_Info_Comments GLEAN3_10593 ###
This gene spans three GLEAN predictions: GLEAN3_10593 + GLEAN3_13086 + GLEAN3_03874
The GLEAN3_13086 and GLEAN3_03874 predictions are overlapping (exons 3 and 4); gene duplication probably due to assembly
3'UTR of this gene is missing

Exon 	Start 	Stop 	Scaffold
1	24676	24229	23709
1?	2159	1712	12273
2	13236	13078	23709
3	95271	95968	1679
3?	5804	5107	38156
4	96341	96594	1679
4?	4742	4489	38156
5	106938	107067	1679
6	109648	109828	1679
7	110254	110486?	1679	3'UTR missing
Database version 2005/07/18
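
Exon tables like the one above can be read programmatically. The following is a minimal sketch, not part of the original annotation. It assumes that columns are separated by tabs or spaces, that coordinates are 1-based and inclusive, and that Start > Stop means the exon lies on the minus strand of the scaffold. None of these conventions is stated in the source, and the names `Exon`, `parse_exon_row`, `strand` and `exon_length` are hypothetical.

```python
from typing import NamedTuple, Optional


class Exon(NamedTuple):
    label: str       # exon number as written, e.g. "3" or "3?"
    start: int
    stop: int
    scaffold: str
    uncertain: bool  # True when the label or a coordinate carries "?"
    note: str        # trailing free text, e.g. "3'UTR missing"


def parse_exon_row(line: str) -> Optional[Exon]:
    """Parse one table row; return None for headers and free-text lines."""
    fields = line.split()  # handles both tab- and space-separated rows
    if len(fields) < 4:
        return None
    label, start, stop, scaffold = fields[:4]
    numeric = [f.rstrip("?") for f in (label, start, stop)]
    if not all(n.isdigit() for n in numeric) or not scaffold.isdigit():
        return None
    uncertain = any(f.endswith("?") for f in (label, start, stop))
    return Exon(label, int(numeric[1]), int(numeric[2]), scaffold,
                uncertain, " ".join(fields[4:]))


def strand(exon: Exon) -> str:
    """Assumption: Start > Stop marks the minus strand."""
    return "-" if exon.start > exon.stop else "+"


def exon_length(exon: Exon) -> int:
    """Assumption: coordinates are inclusive on both ends."""
    return abs(exon.stop - exon.start) + 1
```

Rows flagged `uncertain` (a "?" on the exon number or a coordinate) are kept rather than dropped, so the alternative copies on other scaffolds remain visible to whoever reviews the model.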
###Gene_Info_Comments GLEAN3_05979 ###
This gene spans two GLEAN predictions: GLEAN3_05979 + GLEAN3_03084

Exon 	Start 	Stop 	Scaffold
1	58051	57408	1179
1?	54625	55052?	21642	incomplete
2	19833	19675	1179
3	16229	16386	1201
4	17471	17710	1201
5	18192	18357	1201
6	31949	32079	1201
7	40928	41191	1201
8	41532	41731	1201
9	42426	42601	1201
10	43112	44008	1201
Database version 2005/07/18
###Gene_Info_Comments GLEAN3_03084 ###
This gene spans two GLEAN predictions: GLEAN3_05979 + GLEAN3_03084

Exon 	Start 	Stop 	Scaffold
1	58051	57408	1179
1?	54625	55052?	21642	incomplete
2	19833	19675	1179
3	16229	16386	1201
4	17471	17710	1201
5	18192	18357	1201
6	31949	32079	1201
7	40928	41191	1201
8	41532	41731	1201
9	42426	42601	1201
10	43112	44008	1201
Database version 2005/07/18
###Gene_Info_Comments GLEAN3_13015 ###
This is one haplotype; the other is GLEAN3_20451.
###Gene_Info_Comments GLEAN3_28275 ###
This gene spans two GLEAN predictions: GLEAN3_02473 + GLEAN3_28275
The GLEAN3_02473 prediction appears to include two different genes, a helicase/ zinc finger protein and the first two Sp-osteonectin exons

Exon 	Start 	Stop 	Scaffold
1	27772	27363	56445
2	17533	17392	56445
3	19999	20241	2764
4	21405	21615	2764
5	23776	23924	2764
6	26489	26684	2764
Database version 2005/07/18
###Gene_Info_Comments GLEAN3_01893 ###
This model was identified in part on the basis of PFAM Lectin_C domains.  It has alternating Lectin_c and Fn3 domains.
###Gene_Info_Comments GLEAN3_09885 ###
This gene was identified partially on the basis of the PFAM domains included in the model.
###Gene_Info_Comments GLEAN3_03825 ###
added 3'UTR based on EST evidence
###Gene_Info_Comments GLEAN3_14594 ###
5' and 3' UTRs added based on EST evidence

This gene and its paralog GLEAN3_22537 are urchin-specific homologs of metazoan 14-3-3 proteins. It is likely that the ancestral metazoan had two 14-3-3 proteins: one epsilon ortholog (GLEAN3_03825) and one other. The latter has undergone differential expansion in vertebrates, nematodes, insects and echinoderms.
###Gene_Info_Comments GLEAN3_22537 ###
5' and 3' UTRs added based on EST evidence

This gene and its paralog GLEAN3_14594 are urchin-specific homologs of metazoan 14-3-3 proteins. It is likely that the ancestral metazoan had two 14-3-3 proteins: one epsilon ortholog (GLEAN3_03825) and one other. The latter has undergone differential expansion in vertebrates, nematodes, insects and echinoderms.
###Gene_Info_Comments GLEAN3_06917 ###
Note that the 3' end of this gene (encoding the C-terminus of the protein) is contained on Scaffold118064, and that there is some overlap between the sequence at the 5' end of that scaffold and the 3' end of Scaffold874, which contains the 5' end of the gene (encoding the N-terminus).
###Gene_Info_Comments GLEAN3_07852 ###
The existence of this second Runx gene was predicted based on  Southern genomic blot analysis (Coffman et al., Dev. Biol. 174 (1), 43-54 (1996)).  No evidence for expression in the embryo.
###Gene_Info_Comments GLEAN3_23469 ###
Partial gene model. Single exon in the scaffold matches part of GLEAN3_01638. Possible assembly problem
###Gene_Info_Comments GLEAN3_10203 ###
Also found in scaffolds 113729, 14987, and 2038
###Gene_Info_Comments GLEAN3_10876 ###
Based on comparison with the cloned orthologue from Lytechinus pictus (LpPKC1, acc. no. U02967), this Glean3 prediction encodes the C-terminal half of the protein (beginning with exon 8), corresponding to nucleotides 1108-2177 of the LpPKC1 cDNA (note that the full length CDS entered here is that of the Sp homologue, which was cloned by RT-PCR in our lab using primers based on manual assembly of the gene using EST sequences, genomic trace sequences, and LpPKC1 as a scaffold; this sequence has not yet been deposited at NCBI).  There appears to be missing sequence corresponding to at least one exon, between exon 9 and the C-terminal two exons; some of this falls on a very short scaffold (Scaffold131377). The sequence encoding the N-terminal half (exons 1-7) is contained on Scaffold161 (in gene models Glean3_01048 and Glean3_01049), corresponding to nucleotides 1-1107 of LpPKC1.
###Gene_Info_Comments GLEAN3_13522 ###
Amino acids 1-125 are missing and are most likely on another scaffold. Amino acids 312-382 are also missing, probably lost in a stretch of Ns (NNNNNNN) in the sequence.
###Gene_Info_Comments GLEAN3_07485 ###
3' UTR extended; information added from the spline EST data.
###Gene_Info_Comments GLEAN3_13414 ###
the C-terminus of this known gene has its own Glean3 number and prediction (Glean3_14515); combined predictions were annotated in this entry
###Gene_Info_Comments GLEAN3_14515 ###
This Glean model is the C-terminus of another Glean-defined model (Glean3_13414); all information will be entered with that Glean annotation.
###Gene_Info_Comments GLEAN3_25428 ###
EST data from cleavage-blastula
###Gene_Info_Comments GLEAN3_14131 ###
this is clearly an ortholog of LvNotch along its entire length.  Therefore the expression patterns which are known for that species are shown in the embryo expression series
###Gene_Info_Comments GLEAN3_03898 ###
PROBLEM: in the scaffold200, where this prediction resides, regions containing exons 4 and 5 are duplicated: potential assembly problem!!! this messed up the original prediction quite badly.
###Gene_Info_Comments GLEAN3_26277 ###
Reference:
Ferkowicz,M.J., Stander,M.C. and Raff,R.A.
Phylogenetic relationships and developmental expression of three sea urchin Wnt genes
Mol. Biol. Evol. 15 (7), 809-819 (1998)
###Gene_Info_Comments GLEAN3_11756 ###
Reference:
Ferkowicz,M.J., Stander,M.C. and Raff,R.A.
Phylogenetic relationships and developmental expression of three sea urchin Wnt genes
Mol. Biol. Evol. 15 (7), 809-819 (1998)
###Gene_Info_Comments GLEAN3_26371 ###
Exons 9-13 of this gene are located on another scaffold (scaffold 73533, with the GLEAN3_09526 gene model).

###Gene_Info_Comments GLEAN3_24792 ###
Potential duplication of the scaffold region; the Glean3_24793 prediction has 100% identical exons.
###Gene_Info_Comments GLEAN3_24793 ###
the prediction is incomplete; another prediction on the same scaffold is longer and contains the regions within this glean as if duplicated
###Gene_Info_Comments GLEAN3_10040 ###
the Glean only described the C-terminal part of the gene;
human ortholog Accession Number is NP_892023
###Gene_Info_Comments GLEAN3_20371 ###
Reference:
Wikramanayake,A.H., Peterson,R., Chen,J., Huang,L., Bince,J.M., McClay,D.R. and Klein,W.H.
Nuclear beta-catenin-dependent Wnt8 signaling in vegetal cells of the early sea urchin embryo regulates gastrulation and differentiation of endoderm and mesodermal cell lineages
Genesis 39 (3), 194-205 (2004)
###Gene_Info_Comments GLEAN3_23952 ###
The 3' end of the Guisti et al. mRNA sequence is not part of any known Abl protein.  It is probably UTR, but it has specific hits on two Scaffolds, # 41 and 1211. 
###Gene_Info_Comments GLEAN3_10021 ###
Histidine Decarboxylase
###Gene_Info_Comments GLEAN3_03345 ###
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137536970-15530-214731484586.BLASTQ1
###Gene_Info_Comments GLEAN3_19261 ###
This is a partial sequence.  It appears to be the N-terminal region of Fmi2.  Another glean sequence, Glean_09215, contains this region plus the C-terminal sequence either of this gene or a duplicate form of this gene.  
###Gene_Info_Comments GLEAN3_09526 ###
Exons 1-8 of this gene are located on another scaffold (scaffold 72753, with the GLEAN3_26371 gene model).
Please refer to GLEAN3_26371 for complete modified gene model.
###Gene_Info_Comments GLEAN3_12374 ###
Missing 5' end, no start site 
###Gene_Info_Comments GLEAN3_18391 ###
5' partial 
GLEAN_18392 belongs to 3' end of SpTbx20
###Gene_Info_Comments GLEAN3_20345 ###
5' partial
GLEAN_20346 belongs to 3' end of SpTbx6/16 
###Gene_Info_Comments GLEAN3_24946 ###
5' partial 
GLEAN3_24947 belongs to 3' end of SpWntA
###Gene_Info_Comments GLEAN3_15341 ###
POSSIBLE DUPLICATES glean3_12295, 03996.
###Gene_Info_Comments GLEAN3_12294 ###
Contains part of a peptidase M16 inactive domain. Looks like part of an insulin-degrading enzyme.
###Gene_Info_Comments GLEAN3_23463 ###
5' partial
GLEAN3_23065 and GLEAN3_24669 (identical duplicated seq) belong to 3' end of SpWnt4

Reference:
Ferkowicz,M.J., Stander,M.C. and Raff,R.A.
Phylogenetic relationships and developmental expression of three sea urchin Wnt genes
Mol. Biol. Evol. 15 (7), 809-819 (1998)
###Gene_Info_Comments GLEAN3_26099 ###
Identical to cDNA cloned in our lab
###Gene_Info_Comments GLEAN3_17534 ###
This gene is on two scaffolds (441 and 84105). On scaffold 441, two GLEAN models are predicted for this gene (GLEAN3_17533 for exons 1-8 and GLEAN3_17534 for exons 9-16). On scaffold 84105, there is one GLEAN model (GLEAN3_17719 for exons 17-22) predicted for this gene.
Please refer to GLEAN3_17533 for refined gene features.
###Gene_Info_Comments GLEAN3_17533 ###
This gene is on two scaffolds (441 and 84105). On scaffold 441, two GLEAN models are predicted for this gene (GLEAN3_17533 for exons 1-8 and GLEAN3_17534 for exons 9-16). On scaffold 84105, there is one GLEAN model (GLEAN3_17719 for exons 17-22) predicted for this gene.
###Gene_Info_Comments GLEAN3_19715 ###
Amino acids 46-369 of Sp-Mapk match part of the Glean3_19715 prediction. Amino acids 1-45 of Sp-Mapk match supertig69362_2 on scaffold 69362.
###Gene_Info_Comments GLEAN3_17719 ###
This gene is on two scaffolds (441 and 84105). On scaffold 441, two GLEAN models are predicted for this gene (GLEAN3_17533 for exons 1-8 and GLEAN3_17534 for exons 9-16). On scaffold 84105, there is one GLEAN model (GLEAN3_17719 for exons 17-22) predicted for this gene.
Please refer to GLEAN3_17533 for refined gene features.
###Gene_Info_Comments GLEAN3_17790 ###
Highly homologous to a cDNA cloned in Sphaerechinus granularis (AJ841701)
###Gene_Info_Comments GLEAN3_19369 ###
Near-complete annotation for Rendezvin.  Also includes sequences from Scaffolds 102292 and 92747 (GLEAN3_27023).
###Gene_Info_Comments GLEAN3_00668 ###
The predicted protein sequence bears 2 AA substitutions (107H->N and 203L->V) compared to the previously published S. purpuratus Univin (AAA57553).
This might reflect polymorphism.

###Gene_Info_Comments GLEAN3_24859 ###
Partial prediction corresponding to the N-terminal end of eIF4G. In this scaffold, exon 21 stops at position 39027 (instead of 39057), and exon 22 (not predicted) spans nt 39579 to the end of the scaffold. GLEAN3_19064 corresponds to the C-terminal counterpart. There is a stretch of Ns between exons 1 and 2 of that prediction; these exons correspond respectively to exons 23 and 25, and the missing exon (#24) is in scaffold155969 (=glean3_12286). Exon 17 is duplicated (scaffold137842, glean3_06085).
The modified gene model is entered in the gene features and is highly homologous to a Sphaerechinus granularis cDNA.
###Gene_Info_Comments GLEAN3_20330 ###
Only the last exon of GLEAN3_20330 belongs to SpGrm, and a big part of the predicted ORF is DNA mismatch repair protein Mlh3, which has a duplicated seq in genome
###Gene_Info_Comments GLEAN3_20678 ###
I do not have any experimental evidence, but based on alignments with human and mouse Nanos1 orthologs I feel this GLEAN prediction may not be correct. I predict that the gene is only one exon and has the following sequence:

ATGGAGACATCTTCTTGGGATCTTTTCATGGGGAAAGGGTTGAACCTCAGTGAGATCATTTCTTCGACAAGCTGGAAAACTCCTCCAACCATGGCCATGCCACAACATTCACCAGCGATGTGGCCATCATCTCCGTGCCCATCGCCGCCTATGTCTCCATGGCCAGCTTTATCTCCCCCTATGTCTCCATGGCCAGCTCTATCTCCCTCAAGCACCGTACCACCATCAGCTTCACCACCACCATCAGCATCATCATCGCCGCATGAAGATGAGTTGATATTTCGATCCAGCTTTACCGACACCCTATCTGTCTCTTATGAGAAGAAGCGATACCTCAACACTTACTGCGTGTTCTGTAAGAACAACAAAGAAACTCTTTGCTTCTACAGCTCTCATGTCCTGAAGGATGATTTGGGGAACGTTCAATGTCCTGTTCTTAGGGCTTACAAGTGTCCTATTTGTGGGGCGAAGGGTGATAATGCGCACACCGTCAAGTATTGTCCTCAAAATTCCAGTTCATCAAAAGCCGAGAAGCTGACCAAATCATCAGGTTGCTGGTCGGATTACCCATCACCCCCGGGATTTTTTTAA
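
A single-exon prediction like the one above can be given a basic sanity check before it is compared with the orthologs. The following is a minimal sketch, not part of the original annotation. It assumes the standard stop codons (TAA, TAG, TGA) and reading frame 1 starting at the first base. The function name `check_single_exon_orf` is hypothetical, and no result for the sequence above is claimed here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def check_single_exon_orf(seq: str) -> dict:
    """Report simple ORF properties of a candidate coding sequence."""
    seq = "".join(seq.split()).upper()  # drop whitespace and line breaks
    in_frame = len(seq) % 3 == 0
    # Split into complete codons in frame 1; a trailing partial codon is ignored.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    ends_with_stop = in_frame and bool(codons) and codons[-1] in STOP_CODONS
    # Every codon except a proper terminal stop counts as internal.
    body = codons[:-1] if ends_with_stop else codons
    internal_stops = [i for i, c in enumerate(body) if c in STOP_CODONS]
    return {
        "length": len(seq),
        "starts_with_atg": seq.startswith("ATG"),
        "multiple_of_three": in_frame,
        "ends_with_stop": ends_with_stop,
        "internal_stops": internal_stops,  # 0-based codon indices
    }
```

A sequence that starts with ATG, has a length divisible by three, ends with a stop codon and has no internal stops is consistent with an uninterrupted ORF. This check alone does not show that the one-exon model is correct.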
###Gene_Info_Comments GLEAN3_26559 ###
I don't have any experimental evidence, but this gene may not be annotated correctly. There is a large insertion in the middle of the protein that is not found in the human, mouse, fly, or worm orthologs.
###Gene_Info_Comments GLEAN3_27826 ###
Tiling microarray data predict an additional expressed tag between exons 3 and 4, but it is not present in the EST for this same gene. This gene is homologous to isoform 5 of mammalian Gbeta, but since urchin has only 1 other isoform (homologous to Gbeta1-4, and named "a"), this one is "b".
###Gene_Info_Comments GLEAN3_16157 ###
This gene spans two GLEAN predictions: GLEAN3_18353 + GLEAN3_16157
The GLEAN3_16157 prediction is contained within GLEAN3_03874 (exons 3 and 4); gene duplication probably due to assembly and/or haplotype

Exon 	Start 	Stop 	Scaffold
1	48765	48989	773
2	51585	51683	773
3	61327	61444	773
3?	41672	41789	85877
4	62157	62392	773
4?	42315	42550	85877
5	63630	63768	773
6	64653	64771	773
7	65224	65337	773
8	65903	66014	773
9	66514	66719	773
10	67327	67588	773
11	68104	68246	773
12	68855	68980	773
13	71135	71325	773	3'UTR missing
Database version 2005/07/18
###Gene_Info_Comments GLEAN3_15335 ###
Deleted 5' exon (16,057-16,235), not present in known cDNAs.
###Gene_Info_Comments GLEAN3_03874 ###
This gene spans three GLEAN predictions: GLEAN3_10593 + GLEAN3_13086 + GLEAN3_03874
The GLEAN3_13086 and GLEAN3_03874 predictions are overlapping (exons 3 and 4); gene duplication probably due to assembly
3'UTR of this gene is missing

Exon 	Start 	Stop 	Scaffold
1	24676	24229	23709
1?	2159	1712	12273
2	13236	13078	23709
3	95271	95968	1679
3?	5804	5107	38156
4	96341	96594	1679
4?	4742	4489	38156
5	106938	107067	1679
6	109648	109828	1679
7	110254	110486?	1679	3'UTR missing
Database version 2005/07/18
###Gene_Info_Comments GLEAN3_13086 ###
This gene spans three GLEAN predictions: GLEAN3_10593 + GLEAN3_13086 + GLEAN3_03874
The GLEAN3_13086 and GLEAN3_03874 predictions are overlapping (exons 3 and 4); gene duplication probably due to assembly
3'UTR of this gene is missing

Exon 	Start 	Stop 	Scaffold
1	24676	24229	23709
1?	2159	1712	12273
2	13236	13078	23709
3	95271	95968	1679
3?	5804	5107	38156
4	96341	96594	1679
4?	4742	4489	38156
5	106938	107067	1679
6	109648	109828	1679
7	110254	110486?	1679	3'UTR missing
Database version 2005/07/18
###Gene_Info_Comments GLEAN3_18505 ###
This is just a C-terminus of expected Gbeta cDNA. Since it looks like mammalian betas 1-4, it is designated A. Number 3 refers to the fact that it is a C-terminal piece of 3 pieces found. 3'UTR might extend until nt 8,000 by chip data.
###Gene_Info_Comments GLEAN3_00199 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR.  
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_21816 ###
Added 3' exon (7107-7167) and modified next exon (8251-8350) to agree with known cDNA.
###Gene_Info_Comments GLEAN3_08526 ###
Middle part of the urchin GbetaA (hence A2). Further sequence is located on scaffold59878 and "predicted" as Glean3_18505. Two N-terminal exons exactly match sequences from another scaffold (scaffold27704), which contains more N-terminal regions of what I think is the same gene. Prediction is modified to match EST data.
###Gene_Info_Comments GLEAN3_17106 ###
Part of the cDNA sequence (AY130972) could not be found anywhere in the genome. Therefore, the gene model prediction was accepted for exon 2.

Exons 1-3 of this gene are on Scaffold 87957 (GLEAN3_17106). Exons 3-7 are on Scaffold 107218 (GLEAN3_25601). Exon 3 and its flanking sequences overlap between the two scaffolds and can be aligned quite well (e=0). Therefore, the two scaffolds might be assembled as one scaffold. The modified gene model is composed of exons 1-3 from Scaffold 87957 and exons 4-7 from Scaffold 107218. The 3' UTR is extended based on the EST data.

Exons 5-7 are also on Scaffold 55268 (GLEAN3_05718). This might be another allele of this gene.
###Gene_Info_Comments GLEAN3_08194 ###
daz homolog
###Gene_Info_Comments GLEAN3_18353 ###
This gene spans two GLEAN predictions: GLEAN3_18353 + GLEAN3_16157
The GLEAN3_16157 prediction is contained within GLEAN3_03874 (exons 3 and 4); gene duplication probably due to assembly and/or haplotype

Exon 	Start 	Stop 	Scaffold
1	48765	48989	773
2	51585	51683	773
3	61327	61444	773
3?	41672	41789	85877
4	62157	62392	773
4?	42315	42550	85877
5	63630	63768	773
6	64653	64771	773
7	65224	65337	773
8	65903	66014	773
9	66514	66719	773
10	67327	67588	773
11	68104	68246	773
12	68855	68980	773
13	71135	71325	773	3'UTR missing
Database version 2005/07/18
###Gene_Info_Comments GLEAN3_14573 ###
Highest homology to CYP3 family genes. Phylogenetically falls at base of CYP3 family, within CYP clan 3 (MrBayes v3.0.1 WAG+gamma+I model). Percent ID (on masked basis) to CYP3A4 (human) of 43.8% (aa level).
###Gene_Info_Comments GLEAN3_12253 ###
5' and 3' UTR are extended based on EST and expression data.
###Gene_Info_Comments GLEAN3_00436 ###
Highest homology to CYP3 family genes. Phylogenetically falls at base of CYP3 family, within CYP clan 3 (MrBayes v3.0.1 WAG+gamma+I model). Percent ID (on masked basis) to CYP3A4 (human) of 40.8% (aa level).
###Gene_Info_Comments GLEAN3_21555 ###
This gene encodes a precursor for seven putative SALMFamide neuropeptides (see Elphick & Thorndyke, 2005, J. Exp. Biol. 208, 4273-4282.)
GLEAN predicts 3 exons. However, the second exon encodes a signal peptide (see above paper), which is always located at the N-terminus of neuropeptide precursors. Moreover, the tiling data do not show up a signal for exon 1 of the GLEAN prediction. Therefore, I think it is likely that the GLEAN prediction is wrong and should be changed so that the predicted CDS is derived only from the 2nd and 3rd exons of the GLEAN prediction. The tiling data also show signals for sequences located 5' and 3' to the two CDS exons; these may correspond to UTRs but this needs to be confirmed by EST/cDNA sequencing.
###Gene_Info_Comments GLEAN3_15450 ###
orb-like, similar to cytoplasmic polyadenylation binding protein 1
###Gene_Info_Comments GLEAN3_07822 ###
most of the N-terminal part of the glean prediction seems inaccurate.
###Gene_Info_Comments GLEAN3_00615 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(15 to 23), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_00911 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR. 
This is a member of sea urchin-specific Tlr Group IA. 
###Gene_Info_Comments GLEAN3_02442 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR(22), LRR-CT, TM and TIR. This is a member of sea urchin-specific Tlr Group I(orphan). 
###Gene_Info_Comments GLEAN3_02538 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR. 
This is a member of sea urchin-specific Tlr Group IB. 
###Gene_Info_Comments GLEAN3_04139 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR.  
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_04150 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR. 
This is a member of sea urchin-specific Tlr Group IA. 
###Gene_Info_Comments GLEAN3_04311 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR. 
This is a member of sea urchin-specific Tlr Group IE.

###Gene_Info_Comments GLEAN3_04360 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(18 to 23), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_05088 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(18 to 24), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_27278 ###
Amino Acid number 41 predicted from cloned cDNA is an alanine. Glean3 model predicts a T.

Glean3_12529 predicts the same ORF
###Gene_Info_Comments GLEAN3_05950 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(11 to 22), LRR-CT, TM and TIR.
This is a member of sea urchin-specific Tlr Group IA.  
###Gene_Info_Comments GLEAN3_06164 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(23), LRR-CT, TM and TIR. 
This is a member of sea urchin-specific Tlr Group IB.  
###Gene_Info_Comments GLEAN3_06458 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(18 to 24), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_07790 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(10 to 22), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_08278 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(13 to 22), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_08396 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(10 to 21), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_08456 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(16 to 24), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_04599 ###
In the absence of ESTs or cDNA sequences from purpuratus, the 5' and 3' UTR regions have not been described here.
###Gene_Info_Comments GLEAN3_08962 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR. 
This is a member of sea urchin-specific Tlr Group IB.  
###Gene_Info_Comments GLEAN3_08963 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR.  
This is a member of sea urchin-specific Tlr Group IB. 
###Gene_Info_Comments GLEAN3_09037 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR. 
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_20677 ###
The first exons are wrongly predicted; starting with the third exon, the prediction is correct.
Use the accession number NP_999702.1 for the full-length cDNA.
###Gene_Info_Comments GLEAN3_09173 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR.
This is a member of sea urchin-specific Tlr Group I(orphan).

###Gene_Info_Comments GLEAN3_09435 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(13 to 23), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_10575 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(11 to 22), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_10695 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR. 
This is a member of sea urchin-specific Tlr Group IA. 
###Gene_Info_Comments GLEAN3_11537 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR.
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_06988 ###
This gene was annotated based on a manual revision of multiple protein sequence alignments.
The predicted N-terminal SH2 domains align best with vertebrate Syk genes, whereas the predicted C-terminal tyrosine kinase domain aligns best with various vertebrate ZAP-70 tyrosine kinases.

There are some extra exons present in the NCBI prediction (XM_793943.1); however, Glean3_06988 shows a better pairwise alignment to murine Syk. There are still a few gaps in that alignment, which suggests there might be excess sequence in this Glean model.
###Gene_Info_Comments GLEAN3_12257 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(14 to 24), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_13470 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(17 to 24), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_13824 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(16 to 23), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_14041 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR.
This is a member of sea urchin-specific Tlr Group IB.   
###Gene_Info_Comments GLEAN3_14073 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(15 to 22), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_14266 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR.
This is a member of sea urchin-specific Tlr Group IE.

###Gene_Info_Comments GLEAN3_15066 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR.  
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_05171 ###
See also the latest tree of Hox affinities:

Cameron, R.A., Rowen, L., Nesbitt, R., Bloom, S., Rast, J.P., Berney, K., Arenas-Mena, C., Martinez, P., Lucas, S., Richardson, P.M., Davidson, E.H., Peterson, K. J. and Hood, L. (2005) Unusual gene order and organization of the sea urchin HOX cluster. J. Exp. Zoology. In press.
###Gene_Info_Comments GLEAN3_02634 ###
This GLEAN prediction only corresponds to the N-terminus of SpHox7. The homeodomain and C-terminus of this protein are predicted as GLEAN3_05170.
In fact, Glean3_05170 contains the 2nd exon of the gene plus a mispredicted mini-exon.
###Gene_Info_Comments GLEAN3_15303 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(8 to 21), LRR-CT, TM and TIR.  
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_16457 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(10 to 22), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_27685 ###
This gene shows significant similarity to vertebrate SH2-B family members. However, pairwise alignments also suggest there is additional sequence missing from this model. A better alignment to vertebrate SH2-B proteins can be reconstructed if two Fgenesh++ models located in two separate scaffolds are joined. The sequence continuity of these scaffolds is supported by genomic sequence alignments (scaffold83718-to-scaffold53583-to-scaffold1921). A putative N-ter region of this gene is still missing.
###Gene_Info_Comments GLEAN3_19309 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR. 
This is a member of sea urchin-specific Tlr Group IA. 
###Gene_Info_Comments GLEAN3_21420 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(15 to 23), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_22911 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(18 to 24), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_00388 ###
See also the latest tree of Hox affinities:

Cameron, R.A., Rowen, L., Nesbitt, R., Bloom, S., Rast, J.P., Berney, K., Arenas-Mena, C., Martinez, P., Lucas, S., Richardson, P.M., Davidson, E.H., Peterson, K. J. and Hood, L. (2005) Unusual gene order and organization of the sea urchin HOX cluster. J. Exp. Zoology. In press.
###Gene_Info_Comments GLEAN3_23035 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(14 to 23), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_23321 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR.
This is a member of sea urchin-specific Tlr Group I(orphan).
 
###Gene_Info_Comments GLEAN3_27600 ###
A cDNA sequence from another individual has been submitted to GenBank (DQ082723). The sequence differs from the FgenesH prediction in a simple sequence repeat region.
###Gene_Info_Comments GLEAN3_24062 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(11 to 22), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_24204 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR.  
This is a member of sea urchin-specific Tlr Group IC.
###Gene_Info_Comments GLEAN3_24205 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR.
This is a member of sea urchin-specific Tlr Group IC.  
###Gene_Info_Comments GLEAN3_00669 ###
extra exon on 5'end...
unclear duplication GLEAN_21497
###Gene_Info_Comments GLEAN3_24208 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR.  
This is a member of sea urchin-specific Tlr Group IC.
###Gene_Info_Comments GLEAN3_24385 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR.
This is a member of sea urchin-specific Tlr Group IE.

###Gene_Info_Comments GLEAN3_24429 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR.
This is a member of sea urchin-specific Tlr Group IE.

###Gene_Info_Comments GLEAN3_24731 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(18 to 23), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_24733 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(13 to 23), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_24868 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(24), LRR-CT, TM and TIR. 
This is a member of sea urchin-specific Tlr Group IB.  
###Gene_Info_Comments GLEAN3_26200 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(16 to 24), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_27162 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(15 to 24), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_27164 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(12 to 21), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_28639 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(11 to 21), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_28893 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR.
This is a member of sea urchin-specific Tlr Group ID.  
###Gene_Info_Comments GLEAN3_25646 ###
Partial sequence.  DIX domain missing and some deletions (in comparison to the Lytechinus variegatus form).  There appears to be a mistake in sequencing.  Individuals who have sequenced this gene in S. purpuratus did not note these deletions.
###Gene_Info_Comments GLEAN3_13536 ###
Initial Glean3 model is missing exons 681-782 and 2024-2126 and has an incorrect, low complexity exon 2154-2787 as exon 1.  The 5' end of this cDNA is in GLEAN3_09788(Scaffold 1841)
###Gene_Info_Comments GLEAN3_13810 ###
This gene was annotated based on a manual revision of multiple protein sequence alignments.

Note: there are slight differences between the Glean and other predictions. None of them shows a significantly improved alignment to Pptn11.
###Gene_Info_Comments GLEAN3_25612 ###
Encodes the C-terminus of Glean3_06917 (see annotation to that gene).
###Gene_Info_Comments GLEAN3_01048 ###
This is a gene that falls on two different scaffolds.  See Glean3_10876 for full annotation.
###Gene_Info_Comments GLEAN3_01049 ###
This is part of a gene that is contained on multiple scaffolds.  See Glean3_10876 for full annotation.
###Gene_Info_Comments GLEAN3_09788 ###
This is the 5' end of the Sp-Alpha P subunit; the completed annotation is on GLEAN3_13536.

###Gene_Info_Comments GLEAN3_21303 ###
Actual Exon number 6 was missing in GLEAN3 prediction.
###Gene_Info_Comments GLEAN3_03911 ###
This gene was fused to an adjacent glean model (GLEAN3_03912) to obtain a full sequence that best aligns with vertebrate IL1AP (the last exon in the original version of this model was removed from the modified model). The DNA and protein sequences corresponding to the modified model are provided.
###Gene_Info_Comments GLEAN3_00428 ###
56 nucleotides encoding a predicted signal peptide were added at the 5'end of the GLEAN3 model by comparison to the corresponding FgeneshAB prediction. 

Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(19 to 23), LRR-CT, TM and TIR.  
###Gene_Info_Comments GLEAN3_01877 ###
1381 nucleotides encoding a predicted signal peptide, LRRNT, and LRRs were added at the 5'end of the GLEAN3 model by comparison to the corresponding FgeneshAB prediction. 
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_03912 ###
A modified version of this model was fused to its adjacent glean model (GLEAN3_03911) based on manually revised sequence alignments to vertebrate IL1AP sequences.

See GLEAN3_03911 for more details, features and modified sequence.
###Gene_Info_Comments GLEAN3_21516 ###
Deleted 5' exon (6960-6973) not represented in known cDNAs. GLEAN3_21516 is missing N-terminal half of amino acid coding sequence reported in AF061750, AF036902.
###Gene_Info_Comments GLEAN3_10245 ###
This annotation is based on a manual revision of multiple protein sequence alignments.

Note that the protein encoded by this gene is shorter than that of its vertebrate homologs, which indicates there might be some N-terminal sequence missing from this model.
Similar predictions do not add a significant amount of sequence, nor do they improve the protein alignment to vertebrate SOCS7.
###Gene_Info_Comments GLEAN3_11298 ###
In a neighbor-joining tree based on multiple sequence alignments with vertebrate and fruit fly SOCS-related sequences, this gene does not co-group with any distinct vertebrate homolog. Because its sister group includes hSOCS6 and hSOCS7, and because the most significant Blast hit is to hSOCS6, we named this gene Sp-SOCS6-like. Note, however, that this name only reflects its closest similarity to vertebrate SOCS6 and should not be taken to reflect true orthology.

There are slightly different predictions for this gene, but none of them significantly improved its protein  alignment to vertebrate SOCS genes.
###Gene_Info_Comments GLEAN3_23261 ###
Amino acids 61-159 are missing.
They were probably lost in a stretch of unknown sequence (NNNNNNN).
###Gene_Info_Comments GLEAN3_02792 ###
Even though the best Blast hit for this gene is to hSOCS2, we named GLEAN3_02792 Sp-Socs2/3 because it seems similarly related to both as indicated by neighbor joining trees made from various SOCS-related sequences (it typically co-distributes with a sister group that includes both SOCS2 and SOCS3 and no other vertebrate SOCS genes).

The embryonic expression of this gene is supported by tiling array data.
###Gene_Info_Comments GLEAN3_20879 ###
Glean_20879 sequence incomplete in 5', completed with Glean_12710
###Gene_Info_Comments GLEAN3_02224 ###
75 nucleotides encoding a predicted signal peptide were added at the 5'end of the GLEAN3 model by comparison to the corresponding FgeneshAB prediction. 

Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(14 to 24), LRR-CT, TM and TIR. 

###Gene_Info_Comments GLEAN3_26496 ###
Even though the best Blast hit for this gene is to vertebrate SOCS5, we named GLEAN3_26496 Sp-Socs4/5 because it seems similarly related to both as indicated by neighbor joining trees made from various SOCS-related sequences (it typically co-distributes with a sister group that includes both SOCS4 and SOCS5 and no other vertebrate SOCS genes).

Better Blast hits to this gene were found, but they all correspond to predicted sequences in various genomes. The accession number provided is to murine SOCS5.

Note that the protein encoded by this gene has a longer N-terminal region than that encoded by its vertebrate counterparts. However, the embryonic tiling array data correlate well with the entire predicted coding exon, which argues against possible annotation mistakes.
###Gene_Info_Comments GLEAN3_19743 ###
located in an intron of predicted GLEAN3_19742 (a cysteine protease)
###Gene_Info_Comments GLEAN3_12710 ###
5'region of Glean_20879
###Gene_Info_Comments GLEAN3_03578 ###
54 nucleotides encoding a predicted signal peptide were added at the 5'end of the GLEAN3 model by comparison to the corresponding FgeneshAB prediction. 

Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(12 to 23), LRR-CT, TM and TIR. 

###Gene_Info_Comments GLEAN3_03579 ###
54 nucleotides encoding a predicted signal peptide and 138 nucleotides encoding a part of TIR domain were added at the 5'end of the GLEAN3 model by comparison to the corresponding FgeneshAB prediction and BLASTN. 
This gene model may be a pseudogene.

###Gene_Info_Comments GLEAN3_14243 ###
Predicted amino acid sequence from exon 1 (N-terminus) is not present in mammalian homologs.
###Gene_Info_Comments GLEAN3_11539 ###
96 nucleotides encoding a predicted signal peptide were added at the 5'end of the GLEAN3 model by comparison to the corresponding FgeneshAB prediction. 

Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR. 
This is a member of sea urchin specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_22112 ###
Sp-CSK spans two scaffolds: it begins on the very small scaffold 127652 and continues on scaffold 40333.  The N-terminal half of the predicted sequence contains a 7-transmembrane domain that DOES NOT BELONG.  The original scaffolds, above, contain the correct predicted sequence.
###Gene_Info_Comments GLEAN3_15533 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR. There are 50 unknown nucleotides (NNN) in the TIR domain.
This is a member of sea urchin specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_07343 ###
The alignment between this protein and murine MyD88 is very strong, except for the N-terminus of GLEAN3_07343, which is notably longer than that of vertebrate MyD88 genes. There are several internal methionines in this glean model, which could account for this seemingly "extra" N-terminal sequence.

A partial duplication of this model was found in GLEAN3_07342.
###Gene_Info_Comments GLEAN3_18394 ###
1) Alignment with best blast hits suggests that the glean model contains the complete coding sequence.

2) There is an excellent, but short, match on scaffold76149_1, 1230 to 1344, which is not on the glean3 list.
###Gene_Info_Comments GLEAN3_07342 ###
The protein coded by this sequence is identical to part of the sequence coded by an adjacent Glean model (GLEAN3_07343), and might represent either a haplotype or assembly-originated duplication.
###Gene_Info_Comments GLEAN3_07004 ###
Two GLEAN3 predictions (GLEAN3_07004 & GLEAN3_17646) give the same best GenBank hit (BMP11/GDF11_DanioRerio),
but the divergence between the two sequences seems too high to be due to duplication.
GLEAN3_17646 was therefore annotated as "Sp-BMP11b".
###Gene_Info_Comments GLEAN3_17647 ###
Same best GenBank hit as GLEAN3_07004 (GDF11/BMP11_DanioRerio),
but most likely not a duplication.
Arbitrarily called Sp-BMP11b
to differentiate it from Sp-BMP11 (GLEAN3_07004).


###Gene_Info_Comments GLEAN3_17352 ###
See also the latest tree of Hox affinities:

Cameron, R.A., Rowen, L., Nesbitt, R., Bloom, S., Rast, J.P., Berney, K., Arenas-Mena, C., Martinez, P., Lucas, S., Richardson, P.M., Davidson, E.H., Peterson, K. J. and Hood, L. (2005) Unusual gene order and organization of the sea urchin HOX cluster. J. Exp. Zoology. In press.
###Gene_Info_Comments GLEAN3_17469 ###
A gap in this contig contains at least one exon for this gene.

###Gene_Info_Comments GLEAN3_02874 ###
Amino acids 1-97 of Sp-Ets1-2 match with part of the glean3_04409 prediction on Scaffold1036. Amino acids 464-555 of Sp-Ets1-2 match with part of the glean3_27053 prediction on Scaffold28371.
###Gene_Info_Comments GLEAN3_05170 ###
-This GLEAN prediction covers only the C-terminal part (second exon) of the known protein. The N-terminal part (first exon) is in GLEAN3_02634
-One GLEAN? predicted miniexon has been deleted. It doesn't appear in the known cDNA sequence.

-See also the latest tree of Hox affinities:

Cameron, R.A., Rowen, L., Nesbitt, R., Bloom, S., Rast, J.P., Berney, K., Arenas-Mena, C., Martinez, P., Lucas, S., Richardson, P.M., Davidson, E.H., Peterson, K. J. and Hood, L. (2005) Unusual gene order and organization of the sea urchin HOX cluster. J. Exp. Zoology. In press.
###Gene_Info_Comments GLEAN3_02680 ###
ESTs cover nearly the whole gene. 5' and 3' UTRs not annotated.
###Gene_Info_Comments GLEAN3_15378 ###
See GLEAN3_15377
###Gene_Info_Comments GLEAN3_15379 ###
See GLEAN3_15377
###Gene_Info_Comments GLEAN3_00547 ###
See GLEAN3_15377
###Gene_Info_Comments GLEAN3_15377 ###
GLEAN3_15377, 15378, 15379, and 106060 combine to form one gene.    15377, part of 15378, and 15379 are contiguous and all blast as alpha integrins and GLEAN3_00547 is an overlapping sequence on a small scaffold that includes the last 4 exons of the gene.  The start of translation and first approximately 50 aa not included.
###Gene_Info_Comments GLEAN3_09335 ###
Supported by 3 ESTs
###Gene_Info_Comments GLEAN3_03514 ###
The CDS is much longer than RhoA orthologs.  It is possible that the first exon(s) are not real CDS and that the real start site begins at bp 439 (based on sequence homology)
###Gene_Info_Comments GLEAN3_13815 ###
1st 3 exons are in GLEAN3_13814.  Junction between them is not correct.
###Gene_Info_Comments GLEAN3_13814 ###
This is the first 3 exons of the gene.  The rest is on GLEAN3_13815.  That sequence was annotated to have the correct sequence, except that an exon joining the two GLEANs seems to be missing.
###Gene_Info_Comments GLEAN3_11307 ###
Removed two incorrectly added exons.  One EST covers about 1 kb of the gene.
###Gene_Info_Comments GLEAN3_12694 ###
Also could be considered an ortholog of ABCD2.  The 3' end may be wrong.  The NCBI prediction gives a different C-terminus, but there are no ESTs to verify the correct one.
###Gene_Info_Comments GLEAN3_07530 ###
We have cloned Chk1 from S. purpuratus eggs and are in the process of modifying the model with the updated sequence. There appears to be another exon not identified in the model.
###Gene_Info_Comments GLEAN3_12805 ###
This SPU is part of the same gene as SPU_019224 (SpFrk), by fusing the two SPUs and aligning with HsFrk.  The two SPU sequences have been cloned out of egg cDNA (see mRNA sequence).
###Gene_Info_Comments GLEAN3_14051 ###
Only exons 4 and 5 are present on this scaffold1351. There is a huge gap of Ns (>100kb) where exons 1-3 could be located. However, exons 1-3 are present on scaffold21 GLEAN3_27234 but exons 4-5 are missing. 
###Gene_Info_Comments GLEAN3_24525 ###
cloned from egg cDNA
Glean sequence accepted, however further validation needed for sequence not matching the cloned mRNA sequence.
###Gene_Info_Comments GLEAN3_05957 ###
GLEAN3_05957 prediction contains 2 out of the 3 exons for Sp-4EBP. Exon 1 is located on Scaffold4246 (no GLEAN prediction). Coordinates 52018 to 54824 in GLEAN3_05957 prediction do not match 4EBP. 
New gene model as follows:
scaffold 4246 strand +  start 54530 stop 54673
scaffold 111348 strand -  start 51046 stop 51210
scaffold 111348 strand -  start 49781 stop 49815
The new gene model matches also a Sphaerechinus granularis partial cDNA (accession # AM161045). 
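
The exon coordinates listed above can be checked mechanically. What follows is a minimal sketch, assuming the `scaffold ... strand ... start ... stop ...` line layout used in this comment and 1-based inclusive coordinates. The names `EXON_LINE`, `parse_exons` and `exon_length` are hypothetical and do not belong to any annotation tool.

```python
import re

# Assumed layout, copied from the comment: "scaffold N strand +/- start A stop B".
EXON_LINE = re.compile(
    r"scaffold\s+(\d+)\s+strand\s+([+-])\s+start\s+(\d+)\s+stop\s+(\d+)"
)

def parse_exons(text):
    """Return (scaffold, strand, start, stop) tuples for every matching line."""
    exons = []
    for match in EXON_LINE.finditer(text):
        scaffold, strand, start, stop = match.groups()
        exons.append((scaffold, strand, int(start), int(stop)))
    return exons

def exon_length(start, stop):
    """Length of an exon, assuming 1-based inclusive coordinates."""
    return abs(stop - start) + 1

# The three exon lines of the revised Sp-4EBP model, as given in the comment.
model = """
scaffold 4246 strand +  start 54530 stop 54673
scaffold 111348 strand -  start 51046 stop 51210
scaffold 111348 strand -  start 49781 stop 49815
"""

exons = parse_exons(model)
lengths = [exon_length(start, stop) for _, _, start, stop in exons]
print(lengths, sum(lengths))
```

Whether these spans cover coding sequence only or also include UTR is not stated in the comment, so the summed length should not be read as a CDS length.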

###Gene_Info_Comments GLEAN3_00826 ###
Glean3_00826 and glean3_00827 are very similar to previously cloned sea urchin SM30 genes.  A previously isolated genomic clone (Akasaka et al 1994, JBC 269: 20592-20598) indicates that glean3_00826 is most similar to the SM30-alpha gene. This is because of glean3_00826's intron size and its distance to the next downstream SM30 gene. Glean3_00825, glean3_00826, glean3_00827, and glean3_00828 encode SM30-like proteins and they are tandemly arranged on Scaffold25604.

Matches c-type lectin domain (cd00037).
###Gene_Info_Comments GLEAN3_00508 ###
Full gene of the GbetaA is annotated here. Further C-terminal sequences found in Glean3_08526 and Glean3_18505 are/will be added in. The last 2 exons contained in this Glean match exactly with the first 2 exons contained in Glean3_08526, so there might be a problem with contig assembly.
Exon 5 might be alternatively spliced: it is present in some but not all ESTs.
###Gene_Info_Comments GLEAN3_00827 ###
Glean3_00826 and glean3_00827 are very similar to cloned sea urchin SM30 genes.  A previously isolated genomic clone (Akasaka et al 1994, JBC 269: 20592-20598) indicates that glean3_00827 is most similar to the partially cloned SM30-beta gene. Glean3_00827's intron size and its distance to glean3_00826 support this conclusion. Glean3_00825, glean3_00826, glean3_00827, and glean3_00828 encode SM30-like proteins and they are tandemly arranged on Scaffold25604.

Matches c-type lectin domain (cd00037).
###Gene_Info_Comments GLEAN3_00828 ###
Glean3_00828 is similar to previously cloned S. purpuratus SM30-alpha but to a lesser extent than glean3_00825, glean3_00826, and glean3_00827.  Transcriptome data indicate that this gene is expressed. Glean3_00825, glean3_00826, glean3_00827, and glean3_00828 encode SM30-like proteins that are tandemly arranged on Scaffold25604.

Matches c-type lectin domain (cd00037).

Had to alter gene sequence.  Now has two exons instead of four.
###Gene_Info_Comments GLEAN3_02737 ###
See GLEAN3_01774, _09709. 
###Gene_Info_Comments GLEAN3_18204 ###
See GLEAN3_23216. 
###Gene_Info_Comments GLEAN3_18500 ###
Pulling up same GLEAN3 for dynactin isoform 2 (p50).
###Gene_Info_Comments GLEAN3_26748 ###
partial sequence
This incomplete model has been incorporated in GLEAN3_24044 by Mariano Loza Coll (Toronto).
###Gene_Info_Comments GLEAN3_19711 ###
See GLEAN3_19710. 
###Gene_Info_Comments GLEAN3_04021 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.

Note that the best Blast hit to this model corresponds to a vertebrate Lim-Hox2 gene. However, there is no detectable homeodomain in GLEAN3_04021. An overlapping, larger Fgenesh++ model was inspected for a homeodomain prediction, albeit unsuccessfully. Note, however, that this model is located on a scaffold with several gaps between contigs; therefore, a full LIM-Hox gene may exist in this region and it simply was not picked up by the predictions.

Until better evidence becomes available, we have decided to name this gene Sp-Lim-containing1.
###Gene_Info_Comments GLEAN3_13569 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.

The best Blast hit for this model is vertebrate LMO2, which contains two LIM domains. This model, however, contains a single LIM domain, and is located in a scaffold that contains various gaps between contigs, for which it could as well represent an incomplete model. There are Fgenesh and Genscan models that, though slightly longer, are otherwise identical to GLEAN3_13569. The Genscan model codes for a protein that is almost the same length as LMO2, but that does not include additional LIM domains. For this reason, and until additional evidence becomes available, we have decided to name this gene Sp-Lmo2t (for "truncated").
###Gene_Info_Comments GLEAN3_02633 ###
See also the paper, for the latest tree on Hox affinities: 

Cameron, R.A., Rowen, L., Nesbitt, R., Bloom, S., Rast, J.P., Berney, K., Arenas-Mena, C., Martinez, P., Lucas, S., Richardson, P.M., Davidson, E.H., Peterson, K. J. and Hood, L. (2005) Unusual gene order and organization of the sea urchin HOX cluster. J. Exp. Zoology. In press.
###Gene_Info_Comments GLEAN3_02631 ###
See also the latest tree of Hox affinities:

Cameron, R.A., Rowen, L., Nesbitt, R., Bloom, S., Rast, J.P., Berney, K., Arenas-Mena, C., Martinez, P., Lucas, S., Richardson, P.M., Davidson, E.H., Peterson, K. J. and Hood, L. (2005) Unusual gene order and organization of the sea urchin HOX cluster. J. Exp. Zoology. In press.
###Gene_Info_Comments GLEAN3_02632 ###
See also the latest tree of Hox affinities:

Cameron, R.A., Rowen, L., Nesbitt, R., Bloom, S., Rast, J.P., Berney, K., Arenas-Mena, C., Martinez, P., Lucas, S., Richardson, P.M., Davidson, E.H., Peterson, K. J. and Hood, L. (2005) Unusual gene order and organization of the sea urchin HOX cluster. J. Exp. Zoology. In press.
###Gene_Info_Comments GLEAN3_05169 ###
-THE GLEAN3 PREDICTION COVERS TWO FUSED GENES:SpHox5 and an Acetylcholinesterase. It is a known fact that these genes are very close (or fused?) in the sea urchin genome.
I have seen the exon 2 of Hox genes (indicated in the gene model). The other exons are, most probably (given BLAST values), fragments of a Cholinesterase gene.

-See also the latest tree of Hox affinities:

Cameron, R.A., Rowen, L., Nesbitt, R., Bloom, S., Rast, J.P., Berney, K., Arenas-Mena, C., Martinez, P., Lucas, S., Richardson, P.M., Davidson, E.H., Peterson, K. J. and Hood, L. (2005) Unusual gene order and organization of the sea urchin HOX cluster. J. Exp. Zoology. In press.
###Gene_Info_Comments GLEAN3_21309 ###
See also the latest tree of Hox affinities:

Cameron, R.A., Rowen, L., Nesbitt, R., Bloom, S., Rast, J.P., Berney, K., Arenas-Mena, C., Martinez, P., Lucas, S., Richardson, P.M., Davidson, E.H., Peterson, K. J. and Hood, L. (2005) Unusual gene order and organization of the sea urchin HOX cluster. J. Exp. Zoology. In press.
###Gene_Info_Comments GLEAN3_27568 ###
See also the latest tree of Hox affinities:

Cameron, R.A., Rowen, L., Nesbitt, R., Bloom, S., Rast, J.P., Berney, K., Arenas-Mena, C., Martinez, P., Lucas, S., Richardson, P.M., Davidson, E.H., Peterson, K. J. and Hood, L. (2005) Unusual gene order and organization of the sea urchin HOX cluster. J. Exp. Zoology. In press.
###Gene_Info_Comments GLEAN3_19586 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.
###Gene_Info_Comments GLEAN3_24044 ###
This model was annotated and modified based on full length cDNA sequences from Courtney Smith and Peggy Stevens.  The gene was first isolated from a coelomocyte library (Smith et al. 1996. J. Immunol. 156:593).  QPCR analyses indicate embryonic expression in gastrula.  In situ indicates expression in SMC.

The original version of GLEAN3_24044 was incomplete (C-terminus missing). The rest of its sequence was found on a partly duplicated model (GLEAN3_26748) and annotated by Christian Gache (Villefranche-sur-Mer, France). Both models have now been fused in a modified version of GLEAN3_24044 that is supported by cDNA sequence.
###Gene_Info_Comments GLEAN3_09178 ###
The same gene is found in three different glean3 models, 2 are haplotypes, and 1 contains another part of the gene. 
###Gene_Info_Comments GLEAN3_28698 ###
The same gene is found in three different glean3 models, 2 are haplotypes, and 1 contains another part of the gene. 
###Gene_Info_Comments GLEAN3_18813 ###
Aligns with S. purpuratus SM37 (Lee et al. 1999, Develop. Growth Differ 41: 303-312; PubMed: 10400392). It is on the same scaffold as SM50, which is consistent with Lee et al.'s findings that SM37 and SM50 are linked.

Matches c-type lectin domain (cd00037).


###Gene_Info_Comments GLEAN3_12979 ###
See GLEAN3_02875. 
###Gene_Info_Comments GLEAN3_26899 ###
This Glean has 74% identity at the nt. level with Mus musculus DYRK2 mRNA (from 614 to 1687 out of 2165).
###Gene_Info_Comments GLEAN3_00870 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains frame shift.
###Gene_Info_Comments GLEAN3_24900 ###
See GLEAN3_23252. 
###Gene_Info_Comments GLEAN3_22703 ###
See GLEAN3_08370.  N- and C-termini are shorter than those of other organisms.  
###Gene_Info_Comments GLEAN3_27462 ###
This is the full sequence of PLCg.  It is missing sequences for approx. 40 amino acids in the region of 176-217.

The other parts are located on scaffolds 464 (GLEAN3_10275) and 801 (GLEAN3_06056); however, these have been added to this annotation and contain only incomplete models.
###Gene_Info_Comments GLEAN3_08370 ###
See GLEAN3_22703.  N- and C-termini of the prediction are shorter than those of other organisms.  
###Gene_Info_Comments GLEAN3_20904 ###
We pulled this partial cDNA from an Sp egg library.  It aligns with the first half of this Glean prediction.  Glean3_05654 is also a good BLAST hit.  
###Gene_Info_Comments GLEAN3_07964 ###
This gene was annotated based on a curated analysis of alignments to known vertebrate genes.

The protein inhibitor of activated STAT (PIAS) family of proteins has been proposed to regulate the activity of many transcription factors, including STATs, and recent genetic studies support an in vivo function for PIAS proteins in the regulation of innate immune responses.

This gene model was modified at the time of annotation. One of its exons was removed from its original version, since its expression was not supported by the embryonic tiling array data, which otherwise strongly support the expression of every other exon in this model.

Once the exon was removed, the alignment of this model to its vertebrate counterpart was significantly improved.

Also note that unaccounted exons for this model might exist, also based on the tiling array data.
###Gene_Info_Comments GLEAN3_05339 ###
The intron of this GLEAN3 model was modified to a coding region by comparison to the corresponding FgeneshAB prediction. 

Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(16 to 23), LRR-CT, TM and TIR.    
###Gene_Info_Comments GLEAN3_27144 ###
one of two, duplication. The other is _02836. GLEAN3_22717 is overlapping and mostly non-identical. This and _02836 are the internal sequences, while _06197 is the N-terminal sequence

###Gene_Info_Comments GLEAN3_02836 ###
one of two, duplicate. Other is _27144. GLEAN3_22717 is a mostly non-identical overlapping duplicate. This and _27144 are internal sequence, while the N terminal part of this gene is in _06197
###Gene_Info_Comments GLEAN3_11536 ###
Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(11 to 22), LRR-CT and TIR. Unknown sequence (NNN...) in the intron of the current model could make this gene model incomplete.
###Gene_Info_Comments GLEAN3_09762 ###
Protein structure suggests that Strabismus proteins may be members of the Ltap protein family (Kibar, 2001).
###Gene_Info_Comments GLEAN3_12078 ###
This annotation was done with alignments using A. pectinifera mRNA and peptide sequence.  There are a total of 10 scaffolds and 43 exons (predicted).  Of note, in Glean3_24561 (Scaffold101715) there are two predicted exons which are out of sequential order on the scaffold, but do not overlap in their nt. sequence of the protein.  In Glean3_27674 (Scaffold102204) there are two predicted exons which are in sequential order, but have the same nt. sequence.  One was omitted.  Changes have been made to other Glean3 predictions in this annotation, but not in the original glean3 predictions.  Other glean3 predictions of this protein were accepted, usually with little or no change.  They also all refer to this Glean3 prediction in the comments.  Glean3_12078 aligns with ApIP3R from AAs 316-903.
###Gene_Info_Comments GLEAN3_10424 ###
Alternate transcripts, beta-1 and beta-3:
beta-1 NCBI accession #: NM_001032368
beta-3 NCBI accession #: NM_001032369

There are 4 basepairs from mRNA missed between exon 1 and exon 2.
###Gene_Info_Comments GLEAN3_22820 ###
A small fragment (168921 - 169128, 208bp) is highly identical to Sp-Soxb2 (320520 - 320730) on Scaffold467, GLEAN3_25113.
###Gene_Info_Comments GLEAN3_28103 ###
Could be a continuation of the Sp-proteoliaisin (GLEAN3_PLN), based on high sequence identity to the CDS across the repetitive low-density lipoprotein repeat.  NOT CERTAIN, however, given the absence of the complete proteoliaisin cDNA.
###Gene_Info_Comments GLEAN3_00985 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.

###Gene_Info_Comments GLEAN3_27023 ###
CDS sequences are part of rendezvin.  See GLEAN3_19369 for which exons they encode.
###Gene_Info_Comments GLEAN3_24336 ###
Joins with GLEAN3_24337.
###Gene_Info_Comments GLEAN3_24337 ###
Joins with other scaffolds.  Also note: 
- exon 7 start is not defined because of unknown sequence (NNN...)
- exon 14 is missing 1 bp after 29361
- exon 15 is missing 12 bp after 29180
###Gene_Info_Comments GLEAN3_21062 ###
Encodes the final exons of the gene on GLEAN3_24337.
###Gene_Info_Comments GLEAN3_25113 ###
A small fragment (320520 - 320730, 211bp) is highly identical to Sp-Soxb1 (168921 - 169128) on Scaffold732, GLEAN3_22820.
###Gene_Info_Comments GLEAN3_01970 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(6 to 10), LRR-CT, TM and TIR.
###Gene_Info_Comments GLEAN3_01971 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(7 to 10), LRR-CT, TM and TIR.
###Gene_Info_Comments GLEAN3_03684 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
This is a member of sea urchin-specific Tlr Group I(orphan).
###Gene_Info_Comments GLEAN3_04791 ###
A 605 bp intron was modified to a coding region by comparison to the corresponding FgeneshAB prediction.
###Gene_Info_Comments GLEAN3_26687 ###
>Spec2a protein sequence:
MAVQLLFTEEEKALFKSSFKSEDTDGDGKITSEELRAAFKSIEIDLTQEKIDEMMGMVDK
DGSKDMDFSEFLMRKAEQWRGREVQLTKAFVDLDKDHNGSLSPQELRTAMSACTDPPMTE
KEIDAIIEKADCNGDGKICLEEFMKLIHSS

>CDS of Spec2a gene
atggctgtcc aattattatt taccgaagag gaaaaagctt tattcaaaag ctccttcaaa        60
tcagaagaca cggatggcga tggcaaaatc acttctgaag agttgagagc agcgtttaaa       120
tcaattgaaa tagacttgac tcaggaaaag attgacgaaa tgatgggaat ggttgataaa       180
gatggtagca aagatatgga cttttctgag tttttgatga ggaaggcaga acagtggcgc       240
ggaagagaag tacaattaac taaagctttc gtcgacttgg acaaggatca caacggatcc       300
ctcagtcctc aagagctgcg tacagcgatg tcagcatgca ccgatccacc gatgacggag       360
aaggaaatcg atgcaatcat cgagaaagcc gactgcaatg gggacggtaa aatctgcctt       420
gaagaattca tgaaattgat tcactcgtct taa       453 
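
The CDS above is laid out in numbered blocks of ten bases, so it has to be cleaned before it can be compared with the protein sequence. The following is a minimal sketch of that check, assuming the standard genetic code. The helper names `clean_sequence` and `translate` are hypothetical, and only the first CDS line is used as input here.

```python
import re

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))

def clean_sequence(block):
    """Drop spaces, line breaks and position counters; keep A, C, G, T, N."""
    return re.sub(r"[^ACGTN]", "", block.upper())

def translate(cds):
    """Translate in frame 1 up to the first stop codon; unknown codons give X."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# First line of the Spec2a CDS as it appears in the comment, counter included.
first_line = "atggctgtcc aattattatt taccgaagag gaaaaagctt tattcaaaag ctccttcaaa        60"
cds = clean_sequence(first_line)
print(len(cds), translate(cds))
```

Passing the whole CDS block through the same two functions gives a sequence that can be compared directly with the protein listed under `>Spec2a protein sequence:`.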
###Gene_Info_Comments GLEAN3_27987 ###
In addition to duplicated copy, there is a third copy not in the Glean3 list on scaffold 134677
###Gene_Info_Comments GLEAN3_15027 ###
This prediction covers partial CDS that is missing C-terminal sequences, which are probably on another scaffold.  On the gene duplications page, I have listed another copy also lacking C-terminal sequences.  This second copy is on a scaffold, 52540, which contains multiple ACE-like genes, some complete and some partial.  Finally, there is another copy in the genome that is not on the GLEAN3 list on  Scaffold 123895.
###Gene_Info_Comments GLEAN3_21354 ###
GLEAN3_21354 should be linked to GLEAN3_21353
###Gene_Info_Comments GLEAN3_21351 ###
This prediction is one of many ACE genes on scaffold 52540.

Alignment with best blast sequence and transcriptome signals suggest that there may be a missing exon 3' of GLEAN3_21351|Scaffold52540|76503|76746| and 2 exons may be incorrect (see below).
GLEAN3_21351|Scaffold52540|77756|77824| 
>GLEAN3_21351|Scaffold52540|77756|77824| DNA_SRC: Scaffold52540 START: 77756 STOP: 77824 STRAND: + 
GATCTTCCTCGCTTCGTCTTTCATCTCTAAGCCATAGTTTCCCGACAGCATGGACACATTAGACAAAAA
GLEAN3_21351|Scaffold52540|83924|84128
>GLEAN3_21351|Scaffold52540|83924|84128| DNA_SRC: Scaffold52540 START: 83924 STOP: 84128 STRAND: + 
CGTCCAATGCATATTGGACACAGTTCGTGCACAACGGACAGAGTTCCGCATTGCTATCCAAGCATGCCAT
ATGTCCTAGAACTCCCTTCGTCTTTTCGACAAACTGATCAAGGTTGTCGAGCTTGAGCTGAAGATAACAC
GTCACGTTGGTGATCAGGAGTACTGTAGCGAGAACCCCGTATCCGGTCCATCGGAGGGACGCCAT
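
The FASTA-like records above carry their coordinates in the header, which allows a quick consistency check of each pasted exon sequence. The following is a minimal sketch, assuming the header layout shown in this comment and 1-based inclusive `START`/`STOP` values. The names `HEADER` and `check_records` are hypothetical.

```python
import re

# Assumed header layout, as it appears in these comments:
# >ID|ScaffoldN|start|stop| DNA_SRC: ScaffoldN START: a STOP: b STRAND: +
HEADER = re.compile(
    r"^>(\S+?)\|.*START:\s*(\d+)\s+STOP:\s*(\d+)\s+STRAND:\s*([+-])"
)

def check_records(text):
    """Return (model_id, expected_length, actual_length) for each record."""
    results = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        match = HEADER.match(line)
        if match:
            model_id, start, stop, _strand = match.groups()
            current = [model_id, int(stop) - int(start) + 1, 0]
            results.append(current)
        elif current is not None and re.fullmatch(r"[ACGTNacgtn]+", line):
            # Sequence line belonging to the most recent header.
            current[2] += len(line)
    return [tuple(record) for record in results]

# First exon record from the comment above.
record = """>GLEAN3_21351|Scaffold52540|77756|77824| DNA_SRC: Scaffold52540 START: 77756 STOP: 77824 STRAND: +
GATCTTCCTCGCTTCGTCTTTCATCTCTAAGCCATAGTTTCCCGACAGCATGGACACATTAGACAAAAA
"""

results = check_records(record)
for model_id, expected, actual in results:
    print(model_id, expected, actual, "ok" if expected == actual else "MISMATCH")
```

A mismatch between the two lengths would suggest a truncated paste or an off-by-one in the quoted coordinates.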

###Gene_Info_Comments GLEAN3_21357 ###
This prediction is one of many ACE genes or parts of genes on scaffold52540.
###Gene_Info_Comments GLEAN3_26751 ###
Two comments:

1)Exon 3 of this prediction encodes reverse transcriptase domain.  Transcriptome data suggests that this exon is represented in embryo RNA, but cross-reaction cannot be excluded.
>GLEAN3_26751|Scaffold1575|15835|16444| DNA_SRC: Scaffold1575 START: 15835 STOP: 16444 STRAND: + 
AATGAATTCCGTCTAGGAAGATCTACAGTAGCGCGAATTCTAACTTTGCGGAGACTGGTGGAAGGTACTA
AAGCAAAGCATCTGACAGCAGTACTTACGTTCGTGGGTTTTAAGAAGGCCTTCGATTCAATCAATAGGAA
GAAGATGTTAGAGATCTTAAGAGCCTACGGAATACCATACACAATAGTCACAGCAGTAGGGTTGCTGGAC
AAAGTTACTACAGCTCAAGTGCGTTCACCAAATGGAGAGACTGACTACTTTACCATCTTAGCAGGAGTGC
TCCAAGGCAATACTTTAGCACCATACCTATTCATCGTAGCATTGAATTATGCTCTAAGAATGGCTACTGA
ATGGTTCGAGGATCTGGGCTTTACCCTAGAGGAAAGAGAAAGTAGCAGATATTCTGCTGTAATGATCACA
GATACTGACTTTGCTGATGATATTGCACTAATTTCAGACAATGTGGAAAAGGCACAGAAGCTCCTAAAAC
AACTAAAGTCTGCAGCAAGTCAAATCGGTCTACAAATAAACAGTACTAAGACAGAATTCAAGATGTACAA
CCTTCAGCCTATATTTCACACATATCGTCATTTGCCTGACGTAGGAATGC

2) In addition to duplicated copies on the glean3 list, there are many not on the glean3 list with e values>100.  Scaffolds 55516, 93134, 115888, 20, 302, 807, 102465, 95798, 1755, 11431, 52209, 431, 87717, 1081, 53542, 138714, 51779, 86354, 119129, 1241, 1111, 28699, 58694, 18304.  The ACE family cannot be described from the current assembly; many fragments are scattered both on the glean3 list and outside of it.

###Gene_Info_Comments GLEAN3_28021 ###
Alignment with best blast sequence suggests that the model may lack N-terminal sequence.
###Gene_Info_Comments GLEAN3_12611 ###
Prediction covers partial CDS as inferred from alignments with best blast hit.
###Gene_Info_Comments GLEAN3_27362 ###
TWO COMMENTS:

1)Last  three exons are unlikely to be part of this gene and should be deleted from the model.

Exon 8 is COG5048, COG5048, FOG: Zn-finger;  not blasting to thimet oligopeptidase

>GLEAN3_27362|Scaffold50623|88407|89245| DNA_SRC: Scaffold50623 START: 88407 STOP: 89245 STRAND: + 
CTGTATGGATGTGTTTATGTCTTGTGTGATGAGTTTCTTGATTAAATGTCTTACCACATTGATCACATAC
ATAGGGCTTCTCACCTGTATGGGTGTGTTTATGTCTTGTGTGATGAGTTTCTTGATTAAATGTCTTACCG
CATTGATCACATACATAGGGCTTCTCACCTGTATGGGTGTGTTTATGTCTTGTGTGATGAGTTTCTTGAT
TAAATGTCTTACCACATTGATCACATACATAGGGCTTCTCACCTGTATGGATGTGTTTATGTCTTGTGTG
ATGAGTTTCTTGATGAATGTCTTACCACATTGATCACATACATAGGGCTTCTCACCTGTATGGATGTGTT
TATGTCTTGTGTGATGAGTTTCTTGATTAAATGTCTTACCACATTGATCACATACATAGGGCTTCTCACC
TGTATGGATGTGTTTATGTCTTGTGTGATGAGTTTCTTGATTAAATGTCTTACCACATTGATCACATACA
TAGGGCTTCTCACCTGTATGGATGTGTTTATGTCTTGTGTGATGAGTTTCTTGATTAAATGTCTTACCAC
ATTGATCACATACATAAGGCTTCTCACCTGTATGGATGCGCTTATGTCTTGTGTGATGAGTTTCTTGATT
AAATGTCTTTCCACATTGATCACATACATAAGGCTTCTCACCTGTATGGATGCGCTTATGTCTTGTGTGA
TGAGTTTCTTGATTAAATGTCTTACCACATTGATCACATACATAGGGCTTCTCACCTGTATGGATGTGTT
TATGTCTTGTGTGATGAGTTTCTTGATTAAATGTCTTACCACATTGATCACATACATAGGGCTTCTCAC

Exons 9 and 10 do not blast to thimet oligopeptidase.  This region contains a SIR2 domain:  COG0846, SIR2, NAD-dependent protein deacetylases, SIR2 family

>GLEAN3_27362|Scaffold50623|92553|92742| DNA_SRC: Scaffold50623 START: 92553 STOP: 92742 STRAND: + 
CACTCTCCACATCTGGCAATAGGCATGCATGGATATAGGAATGGATAGCGTGTTCACATCTCCACCGACC
CCCATGGTCGATGAGAGGGCCTACTCTCCTGTATCGACGGCGGGTGATGAGGTCATCCACAAATTGCCGA
ATGATGAAATAAAGACGGCTAGCCATGGGTTCGTCTTCACCCACACGCCG
>GLEAN3_27362|Scaffold50623|94169|94282| DNA_SRC: Scaffold50623 START: 94169 STOP: 94282 STRAND: + 
CTGGTAGATGTCCAGTGACCCCTGGATGACCTTCTCGAGGGGGAAGTACTCCTTGAGCTTGTTGTGGTCC
ACAGAGTAGCGCTTCTCTTCGGACATGTTCATGAAGTAGCGCAT
		
2) In addition to multiple copies on the glean3 list, there are excellent matches on scaffolds 71773 and 89507.


###Gene_Info_Comments GLEAN3_26549 ###
Alignment with best blast sequence suggests that the model is correct.
###Gene_Info_Comments GLEAN3_02117 ###
Exon sequences below contain chromo and ChSh domains characteristic of some nuclear proteins, and are unlikely to be part of this gene and should be removed from the gene model.
>GLEAN3_02117|Scaffold310|75820|75949| DNA_SRC: Scaffold310 START: 75820 STOP: 75949 STRAND: + 
TTATCTTTTGTCCTCCTCGTCGTCTTCACTGTGCCACGTCAAACGCTCTTCGTAGAACTGGATGACAATC
TGAGGGCACCTGTGGTTGGCTTCCTTGGCTCGCACAAGGTCTGCTTCGTTGTTGTTTTTC
>GLEAN3_02117|Scaffold310|77625|77755| DNA_SRC: Scaffold310 START: 77625 STOP: 77755 STRAND: + 
CACTTCATGAGGAATAGGAGTTCATTGTTGGATTCTGTGGCACCAATGATTCTCTCTGGATCCAGCCCTC
TGTCAAATCCTCTGTACTTTTTCTCATCTTGTTTCGTCTGCTTGCCATCTTTGGAAGGCCC
>GLEAN3_02117|Scaffold310|78216|78405| DNA_SRC: Scaffold310 START: 78216 STOP: 78405 STRAND: + 
AGATGAGTCTTTCCTCTTCTCGGTTGCAGCAGCTTCTTCTTTCCTTCTTTTCTTTGCTGCTACATCTCCA
GCAGCAACATTTTGAGCTGATTTTCTCTTTAAGGCCTCCTTTTCTCGGATCTTCTTTTCATACGCCTCAA
TTAAATCAGGGCACTCTAGATTGTCCTGGGGTTCCCATGTCGATTCATCA
>GLEAN3_02117|Scaffold310|80007|80191| DNA_SRC: Scaffold310 START: 80007 STOP: 80191 STRAND: + 
TCTCCATAGCCCTTCCACTTGAGGAGGTATTCTACTCTTCCTTTGTGTATCCTCTTATCGACAACCTTCT
CCACTTGGTAGACCTCTTCTTCTTCCTCCTCACTTTCTCCCTCGGTTTTCTTTTCTTCATTTTCTTCACC
TTCTGGTTCAGGCCCATCTTCAGGTTTCCTCTGCTTTTTGCCCAT

###Gene_Info_Comments GLEAN3_10808 ###
Some exons may not belong in this gene:  the first three, at one end of the gene, blast to nothing in the nr database; the second set of two are in the middle of the gene and also do not blast to anything.

>GLEAN3_10808|Scaffold1433|165949|166138| DNA_SRC: Scaffold1433 START: 165949 STOP: 166138 STRAND: + 
ATGACAATGGATACAAATAAGAGGAACACCATGATAAATCTCAAATTAATTCTGACTGTGATGATCATCA
TTTTTCTACAATGTTGGGAAGCTACTTCTCTGTCGTCTGCTCCAGCTCCTAGCCGTTGCATATTTGATGA
AGTTCAAAAGCATCAAAACGTAGAAAGAACACTTATAAAATACCATCCAG
>GLEAN3_10808|Scaffold1433|166548|166721| DNA_SRC: Scaffold1433 START: 166548 STOP: 166721 STRAND: + 
GTGATGTAAGCGCAAAATCAAAGAGGTCAGTAGAAGAAGAAGCAAATGCCTACCAGCCAATCAGAGTGAA
GACGTTTGTCCAGAATGAGGAGCATCTGATGGACTCCGTGCAGGTTGAAAAACTAGAGACCATCATGGCT
GGTGCAACATCTGTTGTTCAAAAACTTCTGTCAG
>GLEAN3_10808|Scaffold1433|167584|167668| DNA_SRC: Scaffold1433 START: 167584 STOP: 167668 STRAND: +

>GLEAN3_10808|Scaffold1433|170097|170175| DNA_SRC: Scaffold1433 START: 170097 STOP: 170175 STRAND: + 
CTTGCCCTGCATGAAGCGTTTCATGTTCTTGGATTTTCTACAAGTCTTTTTGACCAGTTTCAAGATTGTA
GTGTATGTG
>GLEAN3_10808|Scaffold1433|170631|170782| DNA_SRC: Scaffold1433 START: 170631 STOP: 170782 STRAND: + 
AAGATGGACTCGAGTGCGAGACAAGAGAGGATGTTGTGAGAGTGGATGCCGGTGGGCAGTCTAGACTCCA
CACCCCAGCAGTCGTGGCTGCATCTCAGATTCATTTTGGCTGCACTGAAGAAGAAGAAATGGGTGTTCCT
CTGGAAAATCTG

###Gene_Info_Comments GLEAN3_11364 ###
In addition to many copies of this gene on the glean3 list, there are scaffolds that contain excellent matches:  scaffolds 25161 and 85005.
###Gene_Info_Comments GLEAN3_26072 ###
Exons 1 and 2 cannot be confirmed because they do not blast to endothelin converting enzyme.

>GLEAN3_26072|Scaffold692|19600|19738| DNA_SRC: Scaffold692 START: 19600 STOP: 19738 STRAND: + 
ATGACGAGTAGTCAGGCTAAACTCGCCGTCGATGAGGGTGTCGTTGTCAGACGAAAAGCCCCCAAGGTCA
TTACCAGGAATCTGGTCGTCATCGTTGTCGTCTTGGCACTCCTCACCGTGTCACTTATAGTAGCTACCG
>GLEAN3_26072|Scaffold692|21211|21327| DNA_SRC: Scaffold692 START: 21211 STOP: 21327 STRAND: + 
TTGTAATCGCGTCAGACCGGGATAACCTTTCTTCAAGATTACGATCATATACCGGCCACCAAACCTCACC
ATGCCCTGAACCGAAGCAATGTCTCACGCCCTCTTGTGTTAAAGCAG

###Gene_Info_Comments GLEAN3_04765 ###
Partial cds inferred from alignment with best blast hit.
###Gene_Info_Comments GLEAN3_11071 ###
partial cds inferred from alignment with best blast hit.
###Gene_Info_Comments GLEAN3_02141 ###
Partial cds inferred from alignments with best blast hit.
###Gene_Info_Comments GLEAN3_08959 ###
Alignment with best blast sequence suggests that the model may lack N- and C-terminal sequences.
###Gene_Info_Comments GLEAN3_15178 ###
There appear to be several tandemly repeated CPA2 genes; the other, GLEAN3_15179, is a partial CDS while this one appears to be complete, as inferred from alignments to best blast hits.
###Gene_Info_Comments GLEAN3_15179 ###
There are 2 CPA2-like genes on this scaffold.  This is a partial CDS while the other, GLEAN3_15178, appears to be complete, as inferred from alignments to best blast hits.
###Gene_Info_Comments GLEAN3_01397 ###
TWO COMMENTS:

1) This prediction covers partial cds, as inferred from best blast hit alignments.  Two N-terminal exons may not belong to this protein, since they do not blast to CPA genes.
>GLEAN3_01397|Scaffold22766|21673|21839| DNA_SRC: Scaffold22766 START: 21673 STOP: 21839 STRAND: + 
TTATTTTGCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTTCTCCTTCTCCT
TCTTCTTCTTCTTCTTCTTCTTCGTCTTCTTCGTCTTCATTGTCTTCTTTGTCTTCTTCTTCTTCTTCTT
CTTCTGTTTCTTCTTCTTCTTCTTCAT
>GLEAN3_01397|Scaffold22766|21978|22053| DNA_SRC: Scaffold22766 START: 21978 STOP: 22053 STRAND: + 
AGATAACGTTGGCGATAGAGCCAGTCTTGTACGATCTATTACCCATGGCGGATACGGCACTGTCTGATAC
AGATTT

Similarly, 2 C-terminal exons may not belong.
>GLEAN3_01397|Scaffold22766|28996|29077| DNA_SRC: Scaffold22766 START: 28996 STOP: 29077 STRAND: + 
ATAGTTTCCCAGTACGCTCTTGAGAACGACTAGCTTCTCCAGCTGCAGGCGATCTTTAGGGGTGACACGG
AATACCTTGTAC
>GLEAN3_01397|Scaffold22766|29479|29540| DNA_SRC: Scaffold22766 START: 29479 STOP: 29540 STRAND: + 
CCATCGTAATTGACGGCTAGCGCAGTAGCTAAGAGAGCAGTGAATACCAGAAAACGCATCA

2) There are 4 CPA-like genes or parts of genes on this scaffold.

###Gene_Info_Comments GLEAN3_01399 ###
One of a set of 4 whole or parts of CPA-like genes
###Gene_Info_Comments GLEAN3_01400 ###
TWO COMMENTS:

1) The exon below should be added upstream of the present exon 1, since it blasts to the N-terminal part of its best blast hit.

>Supertig22766_5|Scaffold22766|32108|32236| DNA_SRC: Scaffold22766 START: 32108 STOP: 32236 STRAND: + 
ATGCTGATAATAATTATTCAATCTCTTTCAGCTCGACTTCTGGAGGGAGGCGACGCCGTCTTCGATCGGT
CGTCCCGTCGACATCATGGTACCATCGAGCCTTCGGAACAACGTCCACGACATACTGAC

2) This is one of 4 whole or parts of CPA-like genes on scaffold 22766.  This model covers partial CDS, as inferred from alignments with best blast hit.
###Gene_Info_Comments GLEAN3_14935 ###
This is one of two CPA-like genes on scaffold 27970.
###Gene_Info_Comments GLEAN3_12020 ###
This is one of two CPA-like genes on scaffold 48388.  The other is GLEAN3_12021.  There are several clusters of CPA genes that may link to form a large cluster.
###Gene_Info_Comments GLEAN3_12021 ###
This is one of two CPA-like genes on scaffold 48388.  The other is GLEAN3_12020.
###Gene_Info_Comments GLEAN3_04157 ###
BLAST data suggest that the following exons should be deleted from this model.

>GLEAN3_04157|Scaffold48533|10538|10720| DNA_SRC: Scaffold48533 START: 10538 STOP: 10720 STRAND: + 
CTCGTCAGCCATGCTGGTCTCTAGCTCCCTCAGCCATTGCCTGTGCTCATCCGTACCAGGGGTCACACGA
AGCACCTGGTATCTGTAAATTAATAAGAGAGACATATAGAGGGGTATGGAGTGATGGGGTATAGAGGGAA
GAGATGGGAGATTGANGGCACGTCTTCGTGCAATCAAGATTGC
>GLEAN3_04157|Scaffold48533|15593|15676| DNA_SRC: Scaffold48533 START: 15593 STOP: 15676 STRAND: + 
CTTGAGTTTGCGTCTAGCGTTGAACTTCTTCAAGCAATCAACGGTCTCCTGTCTGTGCATTGCTGAAGCA
TAGCGATCTCGGTT
>GLEAN3_04157|Scaffold48533|21092|21201| DNA_SRC: Scaffold48533 START: 21092 STOP: 21201 STRAND: + 
CTGGATCCATGGATGTTTTAGAGCCTGGCAGGCAGAGATACGCTTTCCTGGGTTGACTGTCAGCATGCTA
TCTATCAAGTTCTTTGCTTCAGGTGTCACTGTGTCCCATT
>GLEAN3_04157|Scaffold48533|22651|22855| DNA_SRC: Scaffold48533 START: 22651 STOP: 22855 STRAND: + 
CTGAATTAGACTCCAATCCACCATCTTGAACAATCTCCTCAAAATCCATTGAATGGCTGGCATTGAATAA
TCCAGTCTTCGCTTTGACGATGGCATACTGCCCTTTGAAGGCAGACATATTGCCACAGGTAGCTTGACTT
TGGATGTATCGTAGGGTCTTCCCACTGCGAGTCATGCACTTCTTTCTCCCGACGCTTGCTGCCAT

###Gene_Info_Comments GLEAN3_22102 ###
This is one of two CPA-like genes on scaffold 114652.  There are clusters of CPA-like genes on several scaffolds, raising the possibility that large clusters of these genes exist.
###Gene_Info_Comments GLEAN3_09007 ###
partial CDS, C-terminal only, because this is a short scaffold.
###Gene_Info_Comments GLEAN3_27451 ###
TWO COMMENTS:

1) partial cds; note that there are repeated elements in the cds of the best blast hit.

2) In addition to a duplicated copy on the glean3 list, there is also an excellent match on scaffold 73285.
###Gene_Info_Comments GLEAN3_04957 ###
This gene model may represent a pseudogene or contain a sequence error. 5' upstream sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_05830 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons (frame shift).
###Gene_Info_Comments GLEAN3_06218 ###
This gene model may represent a pseudogene or contain a sequence error. 5' upstream sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
###Gene_Info_Comments GLEAN3_25601 ###
Part of the cDNA sequence (AY130972) could not be found anywhere in the genome. Therefore, accept gene model prediction for exon 2.

Exons 1-3 of this gene are on scaffold 87957 (GLEAN3_17106). Exons 3-7 are on scaffold 107218 (GLEAN3_25601). The two scaffolds overlap at exon 3 and its flanking sequences, which align quite well (e=0). Therefore, the two scaffolds might be assembled as one scaffold. The modified gene model is composed of exons 1-3 from scaffold 87957 and exons 4-7 from scaffold 107218.

Exons 5-7 are also on scaffold 55268 (GLEAN3_05718). This might be another allele of this gene.

Please refer to GLEAN3_17106 for the modified gene model.
###Gene_Info_Comments GLEAN3_05718 ###
Part of the cDNA sequence (AY130972) could not be found anywhere in the genome. Therefore, accept gene model prediction for exon 2.

Exons 1-3 of this gene are on scaffold 87957 (GLEAN3_17106). Exons 3-7 are on scaffold 107218 (GLEAN3_25601). The two scaffolds overlap at exon 3 and its flanking sequences, which align quite well (e=0). Therefore, the two scaffolds might be assembled as one scaffold. The modified gene model is composed of exons 1-3 from scaffold 87957 and exons 4-7 from scaffold 107218.

Exons 5-7 are also on scaffold 55268 (GLEAN3_05718). This might be another allele of this gene.

Please refer to GLEAN3_17106 for the modified gene model.
###Gene_Info_Comments GLEAN3_27819 ###
Possible gene duplication 
GLEAN3_16081
###Gene_Info_Comments GLEAN3_16081 ###
Possible gene duplication
GLEAN3_27819
###Gene_Info_Comments GLEAN3_23660 ###
Partial CDS.  Note there are repetitive elements in the best blast hit.
###Gene_Info_Comments GLEAN3_17575 ###
partial cds; Note that there are repetitive elements in the best blast hit.
###Gene_Info_Comments GLEAN3_07682 ###
There is an excellent match to part of this predicted gene on scaffold 34914_1, but it is not on the glean3 list.
###Gene_Info_Comments GLEAN3_21609 ###
partial CDS, inferred from alignments with best blast hits.  N-terminal sequence is not included on this scaffold, 52285.
###Gene_Info_Comments GLEAN3_03363 ###
Partial CDS, as inferred from best blast hit alignment.  One predicted and fairly long exon, listed below, is probably not included in this protein because it does not blast to carboxypeptidase E.
>GLEAN3_03363|Scaffold112125|14529|14828| DNA_SRC: Scaffold112125 START: 14529 STOP: 14828 STRAND: + 
TCAACCTTTACTTATTGTGTTGTTAAATAAATTCTGTCGGGTTACAATTCTCGGAGGGGGTTGGGCGTCG
GTGAGTGTGATGAGAATAATGGTTGTGAAGATGATTTTGACAACTCTCAGATTTGATTTTATGATGACGA
AGACACCGATGTTAACGATGGTGGTGTTGATGAAGCTGATGCTGCTGCTGCTGATGATGATGATAGTGAT
GATGATTAACGCTGCACTTGCTATGTTGTGGCGTTTGTCGAGGTCAAGGATAATGATTATGATCATAAAA
TCAATCATTATAGCAATGAA
###Gene_Info_Comments GLEAN3_26013 ###
partial CDS, as inferred from alignment with best blast hit.  Appears to be missing N-terminal sequences.
###Gene_Info_Comments GLEAN3_16331 ###
partial CDS, as inferred by alignments with best blast hit. Model appears to be missing N-terminal sequences.
###Gene_Info_Comments GLEAN3_20494 ###
1) This model, GLEAN3_20494, and the neighboring model, GLEAN3_20393, appear to be parts of the same gene, as inferred from alignment with best blast hit.

###Gene_Info_Comments GLEAN3_01638 ###
Subgroup A thrombospondin with TSP type 1 repeats
###Gene_Info_Comments GLEAN3_07105 ###
This gene model may represent a pseudogene or contain a sequence error. A part of intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
New assembly doesn't show any frame shifts or stop codons.
This is a member of sea urchin-specific Tlr Group IB. 
###Gene_Info_Comments GLEAN3_09755 ###
Amino terminal half of the protein is missing due to end of contig.

This region is slightly more closely related to mammalian subgroup A TSPs than to subgroup B TSPs, based on blast.
###Gene_Info_Comments GLEAN3_07430 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
This is a member of sea urchin-specific Tlr Group IC.
###Gene_Info_Comments GLEAN3_09230 ###
partial CDS; 
TWO COMMENTS:
1) Three exons in the model may not belong because they blast to a CUB domain not normally found in this class of peptidase.
>GLEAN3_09230|Scaffold376|224783|225029| DNA_SRC: Scaffold376 START: 224783 STOP: 225029 STRAND: + 
GAGTGGTCCATCACAGCACCCCTTCAGAGGAGGATCCTCGTCACTTTTAATGACTTAAAGTTGGAATCTC
CTTTTGATTTTATTGTCATTCGGGACATGTACCGTGATAAAAAGCCCTACACAGGCGAGAACATGCTACT
CCATCCATTTCTGACGTTAGGACGTACTCTTGATATTGAGTTCTCCTCATCTAGAAGTGGCAGAAGGAGA
GGATTCAATATTTCAGTCTCATGTAGCGAACTTTCAA
>GLEAN3_09230|Scaffold376|226152|226229| DNA_SRC: Scaffold376 START: 226152 STOP: 226229 STRAND: + 
ATACATATCGTCTGATGGATTGCGCAGCAGAGCATTCAATGTGCTTGTCTGAGAAGGGCGTCAAAATAAA
ATGTCATG
>GLEAN3_09230|Scaffold376|227528|227685| DNA_SRC: Scaffold376 START: 227528 STOP: 227685 STRAND: + 
CTTCTGTGGGTTTCTGTGAGGAACTAAACGACACGATTGATGGATCTTGGGATCCTAACATCACATGGTT
TGGTTCTATCGTTCATCGTACATGTATGGATGGATACAGTCTAAAAGGCAATGGAACCCTGCAATGTGTG
CCGGGGTATCACCATTGA
2) Six N-terminal exons are questionable because there is no conservation with this class of peptidase based on alignments to best blast hits.
>GLEAN3_09230|Scaffold376|53883|53961| DNA_SRC: Scaffold376 START: 53883 STOP: 53961 STRAND: + 
ATGCATGTTGATCGTTGTACAACGGTGATAACCGGTGCAACGCACTGTCCTTGGTTCAGTGCCTTTCCCA
TTGATACCT
>GLEAN3_09230|Scaffold376|59076|59173| DNA_SRC: Scaffold376 START: 59076 STOP: 59173 STRAND: + 
GTGACGGTAAAGACTCTGGAATTTTACTCATTGATGAAAGGACAAAAGCAGTAATGACTGACCAACCAAG
ACATGCACAAGAAGCTTTCAAGGAACAG
>GLEAN3_09230|Scaffold376|60698|60806| DNA_SRC: Scaffold376 START: 60698 STOP: 60806 STRAND: + 
ATTGCCAAAGTTCGTGAGTTGGTTCCTACCCGGAGTAGAGATGATATTGCACTGGTTCTTCAATGCCATG
AGGGAAATGTGGATAAAGCAGTACAGTCATTCATAGACG
>GLEAN3_09230|Scaffold376|66479|66525| DNA_SRC: Scaffold376 START: 66479 STOP: 66525 STRAND: + 
ATGGAGCCAAAACTGTTTTGAATGAGTGGCAGTCGCATGGCAAGAAG
>GLEAN3_09230|Scaffold376|69825|69978| DNA_SRC: Scaffold376 START: 69825 STOP: 69978 STRAND: + 
TCTGCAAATAAGAGAAACAAGAAAAAGAAACGAGGCCCTGATGCACCAGATGAGAAATCAAATGGTGGTG
ATGCTGCTGTAGCTAGTAAAACAGGTAAATATAACGCACTAGAGCAATTCCATGGAAAGTTGCCTAAGAC
GGGCAACATGCAAG
>GLEAN3_09230|Scaffold376|103311|103387| DNA_SRC: Scaffold376 START: 103311 STOP: 103387 STRAND: + 
GTGAAGTAAATGGGTACCTGGTAGGAATTTATTCCTTGAAACGCAGCGCGCGTAACAGCTGCACTGCTAA
AGCCAGG


###Gene_Info_Comments GLEAN3_19059 ###
partial CDS; 3 N-terminal exons are questionable as the sequences are not conserved with other metalloprotease1 genes.
>GLEAN3_19059|Scaffold64363|806|935| DNA_SRC: Scaffold64363 START: 806 STOP: 935 STRAND: + 
ATGCTTGGAAAGAAAGTGGAAGGATCCGGACTTGAAGATATCCTTTTGGAAGCTGGTCTGATGTCTTCTG
GGTCTATAAAAGATGTGTCAACAACAGTGCGACAGGAGTCTGCATTGTCACAAGACAATG
>GLEAN3_19059|Scaffold64363|2797|2824| DNA_SRC: Scaffold64363 START: 2797 STOP: 2824 STRAND: + 
TGCTGAGTGGAGGAGAAGCTATGCTGTT
>GLEAN3_19059|Scaffold64363|7517|7616| DNA_SRC: Scaffold64363 START: 7517 STOP: 7616 STRAND: + 
TGCTGAGTGGAGGAGAAGCTATGCTGTTGTGAGTAAAGCCCAGGAGCGAGCAAAACAATACCAACCAGGA
GACAGGCTCCATGGCTTCTCGGTGGAGAAA

###Gene_Info_Comments GLEAN3_07850 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_07986 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
This is a member of sea urchin-specific Tlr Group IE.

###Gene_Info_Comments GLEAN3_08267 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(18 to 24), LRR-CT, TM and TIR. 
###Gene_Info_Comments GLEAN3_09129 ###
Unknown sequence (NNN...) in the intron of the current model could make this gene model incomplete. Modified gene model includes the long unknown sequence.
###Gene_Info_Comments GLEAN3_09829 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
This is a member of sea urchin-specific Tlr Group I(orphan).
###Gene_Info_Comments GLEAN3_15920 ###
5' end of the gene is missing because scaffold data is incomplete.  This gene is on the same scaffold as Sp-AlphaD and adjacent to it.
###Gene_Info_Comments GLEAN3_10940 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
This is a member of sea urchin specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_11030 ###
This is longer than most homologs/orthologs.  May be duplication of GLEAN3_11029 with actual starting codon at bp 105. 
###Gene_Info_Comments GLEAN3_10619 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
This is a member of sea urchin specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_11540 ###
The intron of this gene model was modified to a coding region by comparison to the corresponding FgeneshAB, Fgenesh++ and Genscan predictions.
This is a member of sea urchin specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_11949 ###
The intron of this gene model included a long unknown sequence, so the model was modified to an intronless gene by comparison to the corresponding FgeneshAB prediction.
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_13144 ###
partial CDS; Contains C-terminal half sequences, based on best blast alignment.  Some sequences of predicted exons are not conserved and are therefore questionable.  They are:
>GLEAN3_13144|Scaffold9356|6814|6907| DNA_SRC: Scaffold9356 START: 6814 STOP: 6907 STRAND: + 
TCATTCCTTGAGTATATTCCATCCCTGACTGGTCATGGTGGTGGTTGGTCCTAGTAGAGTAGTGGCATGG
GTTCTCTTACCATCCATCAAGTAC
>GLEAN3_13144|Scaffold9356|12446|12517| DNA_SRC: Scaffold9356 START: 12446 STOP: 12517 STRAND: + 
CTGTAAGAGTAAAACTTGAAAGCGCCCTCGGTGCCCATGGATGCTCCACCTCCATATGCTCCTCCCTTCT
CC
>GLEAN3_13144|Scaffold9356|13299|13352| DNA_SRC: Scaffold9356 START: 13299 STOP: 13352 STRAND: + 
CTGATTTCCCGATGCAGGTATTTGGCAGACATCAAACGGGCAAGGACCCTCAAC
>GLEAN3_13144|Scaffold9356|14762|14873| DNA_SRC: Scaffold9356 START: 14762 STOP: 14873 STRAND: + 
CTGTGTTAAATGTAGACTGTCTCCTAAGGGTGAGCCTGGTAGATTATCTAGAAACCTTGTCAGCTGATTG
GCTGCTTGATCCACGCCTTCTGGGCTGGAGTTGACTGCACAT
>GLEAN3_13144|Scaffold9356|15689|15789| DNA_SRC: Scaffold9356 START: 15689 STOP: 15789 STRAND: + 
CTCATGTTGGTCTTGTTTAGTACCAAGGAGGCAATAGTCTGAAGGTGGGCCAAGACTGGGTCCAAGTTCT
CCTTCTCGGCAAGTCCTTTTAGGAATGACAC

###Gene_Info_Comments GLEAN3_25459 ###
partial CDS, as inferred from alignment with best blast hit.  The following exon sequence is not conserved and may not be part of this gene.
>GLEAN3_25459|Scaffold2341|6491|6568| DNA_SRC: Scaffold2341 START: 6491 STOP: 6568 STRAND: + 
ACTGCCATGCCTGGTATGAAGCGGGACTGCGGTGGCGCAGCAGCGATTTTGGGTGCATTCTATGCAGCCG
TTAAAGAA

###Gene_Info_Comments GLEAN3_13676 ###
The intron was modified to a coding region by comparison to the corresponding FgeneshAB, Fgenesh++ and Genscan predictions.
###Gene_Info_Comments GLEAN3_25520 ###
This model contains two predicted exons that may not belong to this gene, because they are not conserved, while flanking exons are strongly conserved.  The first of these was also included in a haplotype, GLEAN3_25459.
>GLEAN3_25520|Scaffold32958|7581|7658| DNA_SRC: Scaffold32958 START: 7581 STOP: 7658 STRAND: + 
ACTGCCATGCCTGGTATGAAGCGGGACTGCGGTGGCGCAGCAGCGATTTTGGGTGCATTCTATGCAGCCG
TTAAAGAA
>GLEAN3_25520|Scaffold32958|12680|12777| DNA_SRC: Scaffold32958 START: 12680 STOP: 12777 STRAND: + 
GCGATGGCAAACTATCACTTCCTCGTCGATCAGAATGTATACGCTATCTTTCCTCAGCTTCGCTTCGGAA
AGATAGCTACATCCAGATCGCCTCGGAA

###Gene_Info_Comments GLEAN3_13751 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
This is a member of sea urchin specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_03749 ###
This model may encode two similar adjacent genes, but it is difficult to annotate because there are some exact amino acid duplications in different parts, raising the possibility of an assembly error. 
###Gene_Info_Comments GLEAN3_23329 ###
Adjacent to this model is a very closely related gene, GLEAN3_23330.
###Gene_Info_Comments GLEAN3_09159 ###
Both ends of the gene run are at scaffold boundaries and the CDS is incomplete.  I have added 2 exons in the Fgenes H prediction that blast to the gaps in alignments of the GLEAN3 prediction.  One EST
###Gene_Info_Comments GLEAN3_23330 ###
This gene model is adjacent to GLEAN3_23329, which encodes a very similar protein.
###Gene_Info_Comments GLEAN3_12617 ###
Might be missing exons (from best hit alignment, position on contig).
###Gene_Info_Comments GLEAN3_20050 ###
Phylogenetic analysis using the PTPc domains shows that this is an orthologue of the human PTPRs D, F, and S.  The missing extracellular portion of this gene may correspond to GLEAN3_23737.  This sequence has not been added to the GLEAN3_20050 sequence, since cloning has not yet verified this relationship.
###Gene_Info_Comments GLEAN3_27234 ###
Only exons 1-3 are present on this scaffold, 21; exons 4-5 are missing. Exons 4-5 are present on scaffold 1351 (GLEAN3_14051), where exons 1-3 are missing.
###Gene_Info_Comments GLEAN3_15371 ###
The predicted exons 2, 3, 4 and 14 may not belong to this gene, as inferred by alignment to best blast hit.

Exon2
>GLEAN3_15371|Scaffold2|816312|816482| DNA_SRC: Scaffold2 START: 816312 STOP: 816482 STRAND: + 
GTTGAAGCAAAGATTCGAATCACCGAGTTTGATTCTGAATCACGTCGGACTGCTCAGACCTACTACCATA
CCTACCCTTCTCATCAGATCTCATACGATGTCAGACGTGAACTCGAAGCTATAGCCTCGACATCGGGTTC
GCCCACCTACGTGGACGAAGTCACGCAAGAG

Exon3

>GLEAN3_15371|Scaffold2|818170|818272| DNA_SRC: Scaffold2 START: 818170 STOP: 818272 STRAND: + 
CTAGAGGATGTGAAGTCTCGAATGGAGCAGAGATATCACACGGCTAAGGTATGCAGGAAGAAAGGTAGAC
GAGCACGGAAAGAATGTCTGCGTCTAGATCCAG

Exon4

>GLEAN3_15371|Scaffold2|821627|821929| DNA_SRC: Scaffold2 START: 821627 STOP: 821929 STRAND: + 
GACTAGAAGAAGACTGTGCATTGGTATTCAAGAAACGAAGATGCGTAATCATGACAATGCTACGAGATGA
GAGATTATATGTTTACTGCATGCAGGCTGTGATGAAAGCTAGGAAAGTGGATCAAGGGTCCCTTTATTTG
TATGTTATGATCAATTATTTGAAGGAAGGTAGTGATGACAAGATTGGTGATGATCATAATGCCATGGACT
TTGACCGGACATCATGGTCAAGGACCATGGTTAATCTCACGAAATGGAAAGTAGCAATGAAGCGAAACGA
TGAGGGAGGATGTACTTACGAGG

Exon14

>GLEAN3_15371|Scaffold2|831817|831927| DNA_SRC: Scaffold2 START: 831817 STOP: 831927 STRAND: + 
TGATATGCGTGAGGCAAACACCATTGGTGCCGATAAGTACTTCCATGCCCGGGGCAACTTCGACGCTGCT
CAGCGAGGATCAGGAGGTAGATTCGCTGCCGAGGTTATCAG

###Gene_Info_Comments GLEAN3_16055 ###
Added 3' UTR based on EST evidence.
N-terminus is missing due to end of contig.
###Gene_Info_Comments GLEAN3_23767 ###
Exons 6-23 are accepted on the + strand from the GLEAN3_23767 predictions. Exons 2-5 are present on the - strand between exons 7 and 8. Exon 1 is from scaffold 59902; there was no GLEAN3 prediction for it, but it was predicted by FirstEF.
###Gene_Info_Comments GLEAN3_28077 ###
partial CDS at end of scaffold
###Gene_Info_Comments GLEAN3_23734 ###
See GLEAN3_17211. 
###Gene_Info_Comments GLEAN3_01047 ###
partial CDS on short scaffold, 21370
###Gene_Info_Comments GLEAN3_07431 ###
TWO COMMENTS:

1) partial CDS; missing N-terminal sequence as judged by alignment with best blast hit
2) In addition to two glean3 copies, there is an excellent match on scaffold 123301, 194-227, that is not on the glean3 list.
###Gene_Info_Comments GLEAN3_02912 ###
partial CDS; scaffold 52017 is short
###Gene_Info_Comments GLEAN3_27157 ###
partial CDS; scaffold 66222 is short.
###Gene_Info_Comments GLEAN3_15029 ###
Intronless Toll-like receptor with predicted LRR-NT, LRR(22), LRR-CT, TM and TIR.
This is a member of sea urchin specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_15132 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
###Gene_Info_Comments GLEAN3_28898 ###
This gene was annotated based on a manual inspection of multiple sequence alignments.

Note that slightly different models were created by other predictions for this gene. The Glean model provides the best alignment with vertebrate TRAF6, and it was therefore accepted in its original version.

Also note there is a gap in the alignment that is introduced by extra sequence in the glean model. There is no a priori computational evidence to suggest this extra sequence is not real.
###Gene_Info_Comments GLEAN3_16468 ###
Unknown sequence (NNN...) in the intron of the current model could make this gene model incomplete. However, the intron sequence, apart from the NNN run, matches coding sequence of other Sp-Tlr genes.
###Gene_Info_Comments GLEAN3_16501 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
###Gene_Info_Comments GLEAN3_26495 ###
This gene was annotated based on manual inspection of multiple sequence alignments.

A slightly different model was created by the NCBI prediction; however, the pairwise alignment with vertebrate TRAF3 is significantly better for the glean model. 
###Gene_Info_Comments GLEAN3_17180 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
###Gene_Info_Comments GLEAN3_17529 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT and TIR.
This is a member of sea urchin-specific Tlr Group I(orphan).

###Gene_Info_Comments GLEAN3_18055 ###
54 nucleotides encoding a predicted signal peptide were added at the 5' end of the GLEAN3 model by comparison to the corresponding FgeneshAB and Fgenesh++ predictions.
###Gene_Info_Comments GLEAN3_08332 ###
This gene was annotated based on a manual inspection of multiple sequence alignments.

Even though the available information is sufficient to confidently assign this gene as Sp-Traf4, there is also strong computational evidence to suggest that there is missing sequence towards the N-terminus of this gene, which is supported by the fact that this model is located at one end of its respective scaffold.

A slightly different model was created based on a Fgenesh++ prediction; however, the glean/NCBI model shows a better alignment to vertebrate TRAF4.
###Gene_Info_Comments GLEAN3_03462 ###
This gene was annotated based on a manual inspection of multiple sequence alignments.

In a multiple sequence alignment that included vertebrate and Drosophila TRAF family sequences, this gene failed to cluster with any specific family member. Thus, we have decided to name this gene with an arbitrary classifier ("A").

Slightly different models were created by other predictions; however the exonic structure of the glean model is very strongly supported by the tiling array data. The array data also suggest there might be additional sequence absent from the current glean model, which coincides with gaps in the alignment with vertebrate TRAFs. However, at present we have no evidence to determine whether the model should indeed be accordingly modified.

###Gene_Info_Comments GLEAN3_25346 ###
Inspection of the tiling array suggests that glean may have missed the following exons: LVKSIGLYTYGLLLLLSSIQLLTAVRSMVKTIAHGDLQTVPFHMTKSRVIVQSRGHMEAIVLTKSWRKAVNICVINRICSVNPSKRDILITYT
###Gene_Info_Comments GLEAN3_23069 ###
This gene was annotated based on a manual inspection of multiple sequence alignments to family members in other animal groups.

This sequence did not cluster with any specific family member in a multiple alignment tree, and was therefore named with an arbitrary identifier ("B").

Most exons in the present model are supported by the genome-wide tiling array data. Based on the same array data, there seem to be some exons that may have been erroneously left out of the model. We have no evidence to independently support such a possibility, and we have therefore accepted the glean model in its present form.

Also note that the first exon in the glean model codes for some very low complexity amino acid sequence. Exon 1 in the corresponding NCBI model does not include such sequence. However, the transcription of this sequence is supported by the tiling array data. Thus, it is highly likely that the most upstream sequence in exon 1 of the glean model is in fact 5' UTR, and that the true CDS corresponds to that of the NCBI model's exon 1. At present we have no additional evidence to support this possibility.
###Gene_Info_Comments GLEAN3_18100 ###
The first exon was eliminated and 117 nucleotides encoding a predicted signal peptide were added at the 5' end of the second exon of the GLEAN3 model by comparison to the corresponding FgeneshAB prediction.
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_18211 ###
Unknown sequence (NNN...) in the intron of the current model could make this gene model incomplete. The nucleotides in the intron, except NNN..., are highly similar to other Sp-Tlr genes, so it was modified to a coding region.
###Gene_Info_Comments GLEAN3_18212 ###
Unknown sequence (NNN...) in the intron of this gene model could make this gene model incomplete. The nucleotides in the first intron, except NNN..., are highly similar to other Sp-Tlr genes, so it was modified to a coding region and the third exon was eliminated.
###Gene_Info_Comments GLEAN3_18409 ###
A 240 bp intron was accepted as coding region by comparison to the corresponding FgeneshAB prediction.
###Gene_Info_Comments GLEAN3_11877 ###
The 5' end of the gene is incomplete; there are probably 1 or 2 exons (140 amino acids) missing.  The 6 ESTs are up to 3.5 kb upstream and only UTR sequence seems to be available.  There is embryonic expression data for 5' sequences.
###Gene_Info_Comments GLEAN3_20620 ###
Exons 8-19 are from this scaffold57107 and GLEAN3_20620 prediction. Exons 3-7 are from scaffold533 and GLEAN3_23989 predictions except for exon 5 which was only predicted by the Fgenesh++ prediction. Exon 2 is from scaffold65249 with no tracks predicting it. Exon 1 is incomplete and present on scaffold137005 with no tracks predicting it.
###Gene_Info_Comments GLEAN3_25471 ###
This gene was annotated with A. pectinifera mRNA and peptide.  The complete annotation of this gene is on Glean3_12078.  This glean aligns with ApIP3R AAs 1247-1632.
###Gene_Info_Comments GLEAN3_24561 ###
The mRNA and peptide sequences of A. pectinifera were used to check the Glean prediction.  The complete annotation of this gene is on Glean3_12078.  Of note, two of the predicted exons are in sequential order but code for the exact same nts. of the protein.  Also, not all the predicted exons were used in the full annotation under Glean3_12078.  This glean aligns with ApIP3R AAs 116-403.
###Gene_Info_Comments GLEAN3_07452 ###
The first exon of Spec1 contains the 5'UTR and one amino acid (Met). GLEAN3_07452 does not include the first exon.
###Gene_Info_Comments GLEAN3_07449 ###
This predicted gene, GLEAN3_07449, matches exons 4 to 6 of the Spec2c gene. The other predicted gene, GLEAN3_14607, matches exons 1 to 3 of Spec2c. In addition, there are 10 amino acids that do not match any predicted gene.
###Gene_Info_Comments GLEAN3_14607 ###
This predicted gene matches exons 1 to 3 of Spec2c. Another predicted gene, GLEAN3_07449, matches exons 4 to 6 of Spec2c. For more information on Spec2c, see GLEAN3_07449.
###Gene_Info_Comments GLEAN3_03175 ###
The first and last exons of the Spec2d gene are not predicted in GLEAN3_03175.
###Gene_Info_Comments GLEAN3_06513 ###
Duplicated piece matching nucleotide numbers 11549-12671 of this scaffold (#56396) on scaffold #33010 (nucleotide numbers 1-1123, Glean3_23676)
###Gene_Info_Comments GLEAN3_07655 ###
Additional evidence for the existence of this gene has been obtained in Sphaerechinus granularis
GLEAN3_21839 encodes a partial 3'-terminal sequence of the mRNA
Position of the mRNA 5'end has been deduced from Sp ESTs (i.e. CX678933.1)
Tentative assignment of the 3'UTR end position by computational methods and comparison with S. granularis mRNA
###Gene_Info_Comments GLEAN3_03332 ###
GLEAN3_03332 and the neighboring prediction GLEAN3_03333 are partial CDS and may result from assembly problems.  The first two exons of 03332 are exact repeats and this sequence appears again in 03333.
###Gene_Info_Comments GLEAN3_21022 ###
Alignment data suggest that this model contains the complete CDS.
###Gene_Info_Comments GLEAN3_00075 ###
Alignment with best blast hit suggests that the model may be missing N-terminal sequences
###Gene_Info_Comments GLEAN3_15452 ###
Alignment with best blast hit suggests that model may be missing N-terminal sequence.  All exons except for the first one (see below) of the model blast to methionyl aminopeptidase.  This first exon may or may not be correct.
>GLEAN3_15452|Scaffold1614|92736|92855| DNA_SRC: Scaffold1614 START: 92736 STOP: 92855 STRAND: + 
ATGTCTTTCAACAGCTACAGAAAACCCAGACCAGAACAGCTTTCAATATCTTCCAGAAATGGGGCAAAAA
AGCAGAGCCAACAACACCCCAGTAACTTCTCAATTGTTCAGGCAGGAAAG
###Gene_Info_Comments GLEAN3_14474 ###
Partial CDS, because it is on a short scaffold. Alignment with best blast sequences suggests that each of the 4 exons of this model is similar to sequences in the same protein, methionyl aminopeptidase.
###Gene_Info_Comments GLEAN3_06794 ###
Alignment with best blast hit suggests that several internal exons may be missing from the model.
###Gene_Info_Comments GLEAN3_19205 ###
Alignment to best blast hit suggests that there may be two missing internal exons in this model
###Gene_Info_Comments GLEAN3_00619 ###
partial CDS, missing C-terminal sequences, based on alignment with best blast hit sequence
###Gene_Info_Comments GLEAN3_14274 ###
Partial CDS based on alignment with best blast hit sequence
###Gene_Info_Comments GLEAN3_08606 ###
Alignment with best blast hit sequence suggests that model is complete.
###Gene_Info_Comments GLEAN3_11258 ###
Partial CDS
###Gene_Info_Comments GLEAN3_19102 ###
TWO COMMENTS:

1) Alignment with best blast hit sequence suggests that this model may lack N-terminal sequence.
2) This model is nearly identical to the adjacent model GLEAN3_19101 on Scaffold 112071
###Gene_Info_Comments GLEAN3_10630 ###
Alignment with best blast sequence suggests that this model is complete.
###Gene_Info_Comments GLEAN3_02994 ###
Alignment with best blast hit sequence suggests that C-terminal exon(s) may be missing from the model.
###Gene_Info_Comments GLEAN3_18410 ###
Intronless Toll-like receptor with LRR-NT, LRR(11 to 22), LRR-CT, TM and TIR. 
###Gene_Info_Comments GLEAN3_23048 ###
Partial CDS as suggested by alignment to best blast hit sequence
###Gene_Info_Comments GLEAN3_18519 ###
This gene model was modified to an intronless Toll-like receptor by comparison to the corresponding FgeneshAB and Genscan predictions.
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_02598 ###
Alignment with best blast hit sequence suggests that both N- and C-terminal exons are missing from the model
###Gene_Info_Comments GLEAN3_18534 ###
Unknown sequence (NNN...) in the 5'region of the current model could make this gene model incomplete.
###Gene_Info_Comments GLEAN3_28736 ###
There is an excellent, although short, match on Scaffold6547 that is not on the glean3 list.

Alignment with best blast hit data suggest that this model may lack N-terminal sequence.
###Gene_Info_Comments GLEAN3_18838 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. The first intron is similar to a coding region of another Sp-Tlr gene, but the second is not.
###Gene_Info_Comments GLEAN3_18928 ###
A 471 bp intron was accepted as coding region by comparison to the corresponding FgeneshAB and Fgenesh++ predictions.
###Gene_Info_Comments GLEAN3_19042 ###
Unknown sequence (NNN...) in the intron of the current model could make this gene model incomplete. The intron was accepted as a coding region.
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_19834 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
###Gene_Info_Comments GLEAN3_20996 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR(5 to 10), LRR-CT, TM and TIR.
###Gene_Info_Comments GLEAN3_20997 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR(5 to 10), LRR-CT, TM and TIR.
###Gene_Info_Comments GLEAN3_21162 ###
Unknown sequence (NNN...) in the intron of the current model could make this gene model incomplete. This gene is at the end of scaffold.
###Gene_Info_Comments GLEAN3_21225 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
###Gene_Info_Comments GLEAN3_22302 ###
5' end of the gene (200 aa) is missing from the scaffold end.  There is an allele or a paralogue predicted by GLEAN3_02273 Scaffold4501.  
###Gene_Info_Comments GLEAN3_02273 ###
This appears to be 575-840 of SpAlpha-J.  Either an allele or a duplication
###Gene_Info_Comments GLEAN3_26617 ###
Alignment with best blast hit sequence suggests the gene model may lack N-terminal sequence, but otherwise appears to be complete.

There are excellent matches to sequences in this model on scaffold64641_1, 766-1098 and on scaffold 32233, 2047-2418.  Neither of these is on the glean3 list.
###Gene_Info_Comments GLEAN3_04913 ###
Alignment with best blast hit sequence suggests that N-terminal sequence may be missing, but otherwise model appears to be complete.
###Gene_Info_Comments GLEAN3_21008 ###
This appears to be a good prediction for the half of an alphaV,5-like subunit
###Gene_Info_Comments GLEAN3_21395 ###
Partial Toll-like receptor. This gene model is located at the end of the scaffold.
###Gene_Info_Comments GLEAN3_21415 ###
A 546 bp intron and a 63 bp 5'UTR were accepted as coding region by comparison to the corresponding FgeneshAB and Fgenesh++ predictions.
###Gene_Info_Comments GLEAN3_21787 ###
Unknown sequence (NNN...) in the 5' upstream region of the current model could make this gene model incomplete. 331 bp of 5'UTR next to NNN... was accepted as a coding region. The modified gene model has a stop codon, but reflects best gene structure.
###Gene_Info_Comments GLEAN3_15735 ###
Likely only partial gene.
###Gene_Info_Comments GLEAN3_21936 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has a stop codon, but reflects best gene structure.
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_23090 ###
The 5' end of this protein seems to be from some other gene, but the mid-3' end corresponds to Prickle.  
###Gene_Info_Comments GLEAN3_25302 ###
1) GLEAN3_22817 shows a perfect match to the N-terminal region of Sp-Alx1. The scaffold ends within a large intron in the Sp-Alx1 gene.
2) The best Genbank hit (XP_785238) is to GLEAN3_22816, a closely related gene. Note that GLEAN3_22816 and GLEAN3_22817 are on the same scaffold (Scaffold 260).
###Gene_Info_Comments GLEAN3_22451 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
This is a member of sea urchin-specific Tlr Group IC.
###Gene_Info_Comments GLEAN3_12766 ###
This Glean was checked by alignment to A. pectinifera mRNA and peptide.  The full annotation of this gene is in Glean3_12078.  This Glean aligns with ApIP3R AAs 1-93.
###Gene_Info_Comments GLEAN3_03264 ###
haplotype=GLEAN3_08197
###Gene_Info_Comments GLEAN3_02129 ###
Scaffold 1876 hit the first 352 base pairs (1-352) of Sp-Not mRNA (NM_214562 from NCBI), which might be exon 1. The modified gene feature therefore starts from exon 2.
Frame number is decided by assuming exon 2 is frame 0.

###Gene_Info_Comments GLEAN3_23033 ###
A 429 bp intron was accepted as coding region by comparison to the corresponding FgeneshAB, Fgenesh++ and Genscan predictions.
###Gene_Info_Comments GLEAN3_10698 ###
Structure similar to Protein Tyrosine Kinase 7 isoform c precursor.
###Gene_Info_Comments GLEAN3_22716 ###
Partial CDS, lacking N-terminal half, as inferred from alignment with best blast hit sequence.
###Gene_Info_Comments GLEAN3_22909 ###
A 468 bp intron was accepted as coding region by comparison to the corresponding FgeneshAB, Fgenesh++ and Genscan predictions.
###Gene_Info_Comments GLEAN3_25078 ###
Alignment with best blast hit sequence suggests that model is lacking C-terminal 2/3 of gene
###Gene_Info_Comments GLEAN3_12194 ###
Alignment with best blast hit sequence suggests that model is lacking C-terminal half and some N-terminal sequence.
###Gene_Info_Comments GLEAN3_24526 ###
See GLEAN3_26779.  Partial duplication.
###Gene_Info_Comments GLEAN3_19976 ###
Alignment with best blast hit sequence suggests that model lacks N-terminal half of the gene.
###Gene_Info_Comments GLEAN3_02093 ###
Alignment with best blast hit sequence suggests that the model lacks both N- and C-terminal sequences.

There is an excellent, although short, match on Scaffold 1524_1, 455-727, that is not on the glean3 list.
###Gene_Info_Comments GLEAN3_23544 ###
This gene model may represent a pseudogene or contain a sequence error. Intron and 3'UTR sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_24032 ###
Alignments and tiling data suggest that two additional exons are possible at the 3' end, at positions 10,010-10,144 and 13,805-13,986.
###Gene_Info_Comments GLEAN3_06768 ###
Repair polymerase. Conducts "gap-filling" DNA synthesis in a stepwise distributive fashion rather than in a processive fashion as for other DNA polymerases. Has a 5'-deoxyribose-5-phosphate lyase (dRP lyase) activity (By similarity).
###Gene_Info_Comments GLEAN3_24815 ###
A 429 bp intron was accepted as coding region by comparison to the corresponding FgeneshAB, Fgenesh++ and Genscan predictions.
###Gene_Info_Comments GLEAN3_25076 ###
Partial Toll-like receptor. This gene model is located at the end of a small scaffold.
###Gene_Info_Comments GLEAN3_08512 ###
Alignment with best blast sequence suggests that the model may lack N- and C-terminal sequences.

There are a large number of nearly identical sequences on many  scaffolds that are not on the glean3 list.  There are also multiple copies on the glean3 list.
###Gene_Info_Comments GLEAN3_08512 ###
 fragment
###Gene_Info_Comments GLEAN3_25136 ###
Unknown sequence (NNN...) in the intron and the small scaffold could make this gene model incomplete. The third exon was accepted as the coding region of a partial Toll-like receptor.
This is a member of sea urchin-specific Tlr Group ID.
###Gene_Info_Comments GLEAN3_25263 ###
Partial Toll-like receptor. This gene model is located at the end of a small scaffold.
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_25312 ###
Unknown sequence (NNN...) in the intron of the current model could make this gene model incomplete. Intron sequence except NNN is highly similar to other Sp-Tlr genes, so it was accepted as a coding region. 
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_25719 ###
507 bp and 549 bp introns and 426 bp of the 5'UTR were accepted as coding region by comparison to the corresponding FgeneshAB and Fgenesh++ predictions.
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_28940 ###
Partial CDS.  The model may lack N-terminal half sequences.

This model is one of a cluster of 4 closely related NAALAD2 genes on scaffold 496.
###Gene_Info_Comments GLEAN3_09002 ###
Exons 8-28 are present on this scaffold772 and GLEAN3_09002 prediction. Exons 1-7 are present on scaffold772 and GLEAN3_20718 prediction
###Gene_Info_Comments GLEAN3_28938 ###
Alignment to best blast sequence suggests that this model may be complete, possibly lacking sequences at the N-terminal end.

This is one of 4 very similar NAALADase genes on scaffold 496.
###Gene_Info_Comments GLEAN3_28939 ###
Alignment with best blast sequence suggests that this model lacks C-terminal sequences.

This is one of 4 closely related NAALADase genes on Scaffold 496.
###Gene_Info_Comments GLEAN3_28937 ###
Alignment with best blast sequence suggests that this model may be complete.

This is one of 4 closely related NAALADase genes on Scaffold 496.
###Gene_Info_Comments GLEAN3_26274 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
###Gene_Info_Comments GLEAN3_09215 ###
This sequence is a composite of the original Glean_09215 plus the Glean_19261 sequences.  Although these sequences are on separate scaffolds, they appear to be complementary.  Alternatively, it is possible that they are from two different forms of Fmi.
###Gene_Info_Comments GLEAN3_15929 ###
Exons 8-24 are present on this scaffold60165 and modified GLEAN3_15929 prediction.  Modified GLEAN3_15930 and GLEAN3_15931 have been merged into the GLEAN3_15929 prediction. Exons 3-7 are from scaffold41327 and GLEAN3_23590 prediction.  Exons 1-2 are missing.
###Gene_Info_Comments GLEAN3_26275 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons (frame shift).
###Gene_Info_Comments GLEAN3_27163 ###
This gene model may represent a pseudogene or contain a sequence error. Intron and 5'UTR sequences match coding sequences of other Sp-Tlr genes and contain stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
###Gene_Info_Comments GLEAN3_16406 ###
The CDS on the positive strand of this gene aligns exactly with the CDS on the minus strand of Glean3_16409 on the same scaffold. The sequence between CDS regions, as well as the repeat structure in the two regions, is also very similar. Glean3_16409 could be a haplotype that was placed on the wrong scaffold. The contig that it is on is attached to the very end of the scaffold, right next to this gene.
###Gene_Info_Comments GLEAN3_18375 ###
complete gene model : GLEAN3_18375 + GLEAN3_23408
###Gene_Info_Comments GLEAN3_11225 ###
Alignment with best blast sequence suggests that the model lacks the N-terminal half of the gene.
###Gene_Info_Comments GLEAN3_23984 ###
Alignment with best blast sequence suggests that the model lacks the N-terminal half of the CDS.
###Gene_Info_Comments GLEAN3_23985 ###
FOUR COMMENTS: 

1) Alignment with best blast sequence suggests that this model may be a complete NAALAD2 gene, possibly lacking short N- and C-terminal sequences.

2) It is linked to a reverse transcriptase element which may not be part of the gene.  This element is on the following predicted exon:
>GLEAN3_23985|Scaffold113776|19389|20291| DNA_SRC: Scaffold113776 START: 19389 STOP: 20291 STRAND: + 
TTTCCTTTCCGACCGTCATCAGGTGGTTCGACATCAAGGCGTGACATCAAGCCCAAAGAGCCTCGCATGC
GGAGTACCTCAGGGTACAAAGTTGGGACCAATCCTATTCCTTGCCCTTGTTAACGATGCTGCCTTAACGT
CAACATACCGATGGAAGTATGTTGACGACTTAAGTTTGGTGGAAGTCTTGCCTAAAACCCAGCAAAGTTC
CTTACAGGAGTACGTTGATGAGCTCGGTGAATGGTGCGCCATTAATGACGTGACGCCAAAGCCCGAAAAA
TGTAAGGCCATGCAAGTGTCTTTCTTGAAGAATCCTCTTCCTCATTTGGACATCACCATCGCAGATGTTC
ATCTTGAACGTGTTGATTCCTTGACTCTCCTTGGTGTCGCGATCCAATCAGACCTGAAATGGGATAATCA
GGTCCAACAGATGATCTCACGGGCCGCTCGGAGACTGTACATTCTGAGTGTTCTGAAGAAATCTGGAGTC
AACGCGAATGATCTAGTAACCATCTACAAAGCGTATATCCGTCCCCTGATGGAATTTGGTGTCCCTGTCT
GGGGCTCCGGCATTACTAATACGCAGAGTGATAAAATCGAACGAATCCAAAGACGTGCGCTACGTTTCAT
TGTGTATCCAGCTGACCTCTCCTACACACAACGGCTCACTCGTTTCAACTTGCCTATGTTGTGTGAACGC
AGGAATGATCTCCTTCTACGCTTTGGACGTGGTCTCCTCAAGTCTGAACGGCATCGTGACATGCTACCTG
CTACTCGTCAATGTGTCTCTCACCGCAGTTCAACACTGAGAAGTGCTCATCTACTAGACCTACAGCGTTG
TAAAACCCAACGATATAGGAACTCTGCAATCCCGTTTTTAACACGAATGCTCAATTCTTCCAA

3) This model is adjacent to a closely related gene model, GLEAN3_23984, which lacks N-terminal half sequence.

4) There are many excellent matches to this model in scaffolds that are not on the glean3 list.  In addition, there are 3 copies on the glean3 list.
###Gene_Info_Comments GLEAN3_26984 ###
Alignment with best blast sequence suggests that this model may be missing N-terminal CDS.
###Gene_Info_Comments GLEAN3_14154 ###
Alignment with best blast sequence suggests that the model is missing N-terminal CDS.
###Gene_Info_Comments GLEAN3_04980 ###
Alignment with best blast sequence suggests that this model may be missing N-terminal CDS.
###Gene_Info_Comments GLEAN3_14945 ###
Alignment with best blast sequence suggests that this model may be complete.
###Gene_Info_Comments GLEAN3_27698 ###
Partial TLR. The locus of the gene model is at the end of the scaffold.
###Gene_Info_Comments GLEAN3_27721 ###
Unknown sequence (NNN...) in the intron of the current model could make this gene model incomplete. The intron sequence, except NNN..., is highly similar to other Sp-Tlr genes, so it was accepted as a coding region.
This is a member of sea urchin-specific Tlr Group I(orphan).

###Gene_Info_Comments GLEAN3_27798 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has a stop codon, but reflects best gene structure.
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_27815 ###
This gene model may represent a pseudogene or contain a sequence error. 5'UTR sequence doesn't match coding sequence of other Sp-Tlr genes.
This is a member of sea urchin-specific Tlr Group IE.
###Gene_Info_Comments GLEAN3_28404 ###
Intron sequence was accepted as a coding region, which could make the TIR domain complete.
This is a member of sea urchin-specific Tlr Group I(orphan).

###Gene_Info_Comments GLEAN3_21404 ###
Alignment with best blast hit suggests that this model is missing C-terminal CDS.  The best blast hit is huge, >2200 aa.
###Gene_Info_Comments GLEAN3_06660 ###
Alignment with best blast sequence suggests model is missing large N- and C-terminal  CDSs.
###Gene_Info_Comments GLEAN3_17421 ###
There is a family of carbonic anhydrase genes in the sea urchin. These have not  been carefully compared to potential vertebrate orthologs. 
###Gene_Info_Comments GLEAN3_00851 ###
Alignment with best blast sequence suggests that this model lacks the N-terminal half of a huge protein, >2200 amino acids
###Gene_Info_Comments GLEAN3_04135 ###
A family of carbonic anhydrase-like proteins exists in sea urchin. Orthologous relationships with vertebrate carbonic anhydrases and related proteins have not been carefully examined.
###Gene_Info_Comments GLEAN3_02570 ###
Alignment with best blast sequence indicates that this model includes only a small central portion of the CDS.
###Gene_Info_Comments GLEAN3_00871 ###
Unknown sequence (NNN...) in the intron of the current model could make this gene model incomplete. There is a duplicated 600 bp sequence in this gene model.
This is a member of sea urchin-specific Tlr Group IIA.
###Gene_Info_Comments GLEAN3_07418 ###
Partial Toll-like receptor. The locus of this gene model is at the end of a scaffold.
###Gene_Info_Comments GLEAN3_07859 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(14), LRR-CT(2), TM and TIR. 
###Gene_Info_Comments GLEAN3_04136 ###
Annotated gene shows coordinates and sequences of the "long form" of SpP19. There is a shorter, alternatively spliced form (see AF519413).
###Gene_Info_Comments GLEAN3_11823 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(11 to 18), LRR-CT(2) and TIR. 
###Gene_Info_Comments GLEAN3_14191 ###
This gene model was modified based on Fgenesh++ prediction. Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(24), LRR-CT, TM and TIR.
This is a member of sea urchin-specific Tlr Group IB. 
###Gene_Info_Comments GLEAN3_20259 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure. This is a member of sea urchin-specific Tlr Group IB.

###Gene_Info_Comments GLEAN3_24404 ###
Unknown sequence (NNN...) in the intron could make the gene model incomplete. Modified model was accepted as an intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(12 to 22), LRR-CT, TM and TIR.
###Gene_Info_Comments GLEAN3_24960 ###
Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT and TIR.
This gene model may represent a pseudogene or contain a sequence error, but reflects best gene structure.
This is a member of sea urchin-specific Tlr Group ID.

###Gene_Info_Comments GLEAN3_27222 ###
The gene model was accepted as an intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(11 to 20), LRR-CT, TM and TIR.
###Gene_Info_Comments GLEAN3_02174 ###
Alignment with best blast sequence suggests that this model is missing C-terminal CDS.  The best blast hit is >2200 amino acids and the model is 1492.
###Gene_Info_Comments GLEAN3_26537 ###
Alignment with best blast sequence suggests model may be missing a small C-terminal segment.

This model is adjacent to a short segment of a closely related gene, GLEAN3_26538, on Scaffold107330.
###Gene_Info_Comments GLEAN3_26538 ###
Alignment with best blast sequence suggests that the model contains only a small N-terminal segment of the CDS.

This model is adjacent to a very similar gene, GLEAN3_26537, on Scaffold107330.
###Gene_Info_Comments GLEAN3_23590 ###
Refer to GLEAN3_15929 for complete REJ4 gene features
###Gene_Info_Comments GLEAN3_19049 ###
Alignment with best blast sequence suggests that this model contains only  a short segment, which is repeated in human multifunctional protein CAD.
###Gene_Info_Comments GLEAN3_13209 ###
Alignment with best blast sequence suggests this model could be complete if the predicted N-terminal exon which is not conserved is correct.
###Gene_Info_Comments GLEAN3_21494 ###
Alignment with best blast sequence shows that this model contains only a short conserved segment.

There is a match to sequences on Scaffold72931_1, 262-413, that is not on the GLEAN3 list.
###Gene_Info_Comments GLEAN3_09429 ###
Alignment with best blast sequence suggests that this model may be missing a short N-terminal segment and a longer C-terminal segment.
###Gene_Info_Comments GLEAN3_25325 ###
Alignments with best blast hits suggest that the gene model may be correct.
###Gene_Info_Comments GLEAN3_28671 ###
Alignment with best blast sequence suggests that this model contains only a short conserved sequence.  The following exons are considered unlikely to be part of this model.
>GLEAN3_28671|Scaffold36273|17615|17788| DNA_SRC: Scaffold36273 START: 17615 STOP: 17788 STRAND: + 
TTAGATACCTGGAACCTGGGCAAGTAAGTTGACGTAAGAGCAAGTAATCTTGATCTCACAAGTAATCTTG
ATCTCACAAGTAATCTTGATCTCACAAGTAACCTTGATCTCACAAGTAATCTTGATCTCACAAGTAATCT
TGATCTCACAAGTAATCTTGATCTCACAAGTAAC
>GLEAN3_28671|Scaffold36273|17825|18165| DNA_SRC: Scaffold36273 START: 17825 STOP: 18165 STRAND: + 
CTTGATCTCACAAGTAACCTTGATCTCACTAGTAATCTTGATCTCACAAGTAATCTTGATCTCACAAGTA
ACCTTGATCTCACAAGTAACCTTGATCTCACAAGTAACCTTGATCTCACAAGTAATCTTGATCTCACAAG
TAATCTTGATCTCACAAGTAATCTTGATCTCACAAGTAATCTTGATCTCACAAGTAATCTTGATCTTACA
ACTAACCTTGATCTCACAAGTAACCTTGATCTCACAAGTAATCTTGATCTCACAAGTAACCTTGATCTCA
CAAGTAATCTTGATCTCACAAGTAATCTTGATCTCACAAGTAACCCTTACTGTTATCCTTC

###Gene_Info_Comments GLEAN3_00375 ###
Partial Toll-like receptor. The previous contig (separated by 1056 bp of NNN) contains a part of a common TLR structure.
This is a member of sea urchin-specific Tlr Group ID.  
###Gene_Info_Comments GLEAN3_01458 ###
This gene model may be a short Toll-like receptor.  5'UTR sequence doesn't match coding sequence of other Sp-Tlr genes.
###Gene_Info_Comments GLEAN3_01650 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 95% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.
This is a member of sea urchin-specific Tlr Group IA.

###Gene_Info_Comments GLEAN3_01862 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 95% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.


###Gene_Info_Comments GLEAN3_01993 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 99% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.


###Gene_Info_Comments GLEAN3_19349 ###
This gene was annotated based on a manual inspection of protein alignments.

Two other adjacent models (GLEAN3_19350 and GLEAN3_19351) code for very similar sequences. It is yet to be determined to what extent this reflects true gene duplications or assembly problems.

Several Sp-IL17 genes were identified computationally. Each of them shows a best Blast-p hit to specific vertebrate IL17 family members. However, multiple sequence alignments demonstrated that almost all the Sp-IL17 sequences segregate from their vertebrate counterparts as a separate cluster. For consistency purposes, therefore, we decided to arbitrarily assign numerical identifiers to all the Sp-Il17 genes, and thus avoid the implication that they all have specific vertebrate orthologs (which are classified by the letters A-F).
###Gene_Info_Comments GLEAN3_02803 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 94% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.
This is a member of sea urchin-specific Tlr Group IC.

###Gene_Info_Comments GLEAN3_03419 ###
Partial Toll-like receptor. This gene model is located at the end of a short contig. The nucleotide sequence has 96% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.


###Gene_Info_Comments GLEAN3_03846 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 95% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.


###Gene_Info_Comments GLEAN3_04655 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 95% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family. This is a member of sea urchin-specific Tlr Group I(orphan).


###Gene_Info_Comments GLEAN3_05871 ###
This gene was annotated based on a manual inspection of protein alignments.

There might be some missing N-terminal sequence in this model, as suggested by an incomplete alignment to homologous vertebrate sequences, and by signal from the tiling array upstream of the first annotated exon that does not correspond to any other model. A computational search for this potentially missing sequence was unsuccessful.
###Gene_Info_Comments GLEAN3_13950 ###
This gene was annotated based on a manual inspection of protein alignments.

We named this gene "Sp-Il1R-rs1" because no other significant Blast hit could be found for it, and because of its overall domain composition, which closely resembles that of a typical IL-1 receptor. In fact, among the Blast hits obtained for this sequence, the one that spans most of the query corresponded to a Gallus gallus predicted IL1RAcP.
###Gene_Info_Comments GLEAN3_05148 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 97% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.
###Gene_Info_Comments GLEAN3_07991 ###
Unknown sequence (NNN...) in the intron of the current model could make this gene model incomplete. A part of the intron sequence is highly similar to other Sp-Tlr genes, so it was accepted as a coding region and the second exon was eliminated.
###Gene_Info_Comments GLEAN3_08229 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 96% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.
###Gene_Info_Comments GLEAN3_09343 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 96% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.
###Gene_Info_Comments GLEAN3_09459 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 96% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.
This is a member of sea urchin-specific Tlr Group ID.
###Gene_Info_Comments GLEAN3_09933 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 94% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.
This is a member of sea urchin-specific Tlr Group IB. 
###Gene_Info_Comments GLEAN3_02467 ###
Same gene as GLEAN3_02025. All annotation information is documented there. 
###Gene_Info_Comments GLEAN3_09952 ###
The nucleotide sequences of the coding region and 3'UTR have 99% identity to those of GLEAN3_10695.
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_10320 ###
This gene model may represent a pseudogene or contain a sequence error. 3'UTR sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
###Gene_Info_Comments GLEAN3_10680 ###
This gene model may represent a pseudogene or contain a sequence error. 5'UTR sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
###Gene_Info_Comments GLEAN3_10693 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 96% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.
This is a member of sea urchin-specific Tlr Group I(orphan).
###Gene_Info_Comments GLEAN3_11277 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 93% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_05983 ###
This gene was annotated based on a manual inspection of protein alignments.

Another GLEAN model (GLEAN3_12845) codes for a very similar sequence (92% identity). It is yet to be determined to what extent this corresponds to a true gene duplication or a problem with the assembly.

Several Sp-IL17 genes were identified computationally. Each of them shows a best Blast-p hit to specific vertebrate IL17 family members. However, multiple sequence alignments demonstrated that almost all the Sp-IL17 sequences segregate from their vertebrate counterparts as a separate cluster. For consistency purposes, therefore, we decided to arbitrarily assign numerical identifiers to all the Sp-Il17 genes, and thus avoid the implication that they all have specific vertebrate orthologs (which are classified by the letters A-F).
###Gene_Info_Comments GLEAN3_19350 ###
This gene was annotated based on a manual inspection of protein alignments.

Two other adjacent models (GLEAN3_19349 and GLEAN3_19351) code for very similar sequences. It is yet to be determined to what extent this reflects true gene duplications or assembly problems.

Several Sp-IL17 genes were identified computationally. Each of them shows a best Blast-p hit to specific vertebrate IL17 family members. However, multiple sequence alignments demonstrated that almost all the Sp-IL17 sequences segregate from their vertebrate counterparts as a separate cluster. For consistency purposes, therefore, we decided to arbitrarily assign numerical identifiers to all the Sp-Il17 genes, and thus avoid the implication that they all have specific vertebrate orthologs (which are classified by the letters A-F).
###Gene_Info_Comments GLEAN3_19351 ###
This gene was annotated based on a manual inspection of protein alignments.

Two other adjacent models (GLEAN3_19349 and GLEAN3_19350) code for very similar sequences. It is yet to be determined to what extent this reflects true gene duplications or assembly problems.

Several Sp-IL17 genes were identified computationally. Each of them shows a best Blast-p hit to specific vertebrate IL17 family members. However, multiple sequence alignments demonstrated that almost all the Sp-IL17 sequences segregate from their vertebrate counterparts as a separate cluster. For consistency purposes, therefore, we decided to arbitrarily assign numerical identifiers to all the Sp-Il17 genes, and thus avoid the implication that they all have specific vertebrate orthologs (which are classified by the letters A-F).
###Gene_Info_Comments GLEAN3_11328 ###
This gene model may represent a pseudogene or contain a sequence error. Intron and 3'UTR sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified model contains some frame shifts, but reflects best gene structure.

###Gene_Info_Comments GLEAN3_12844 ###
This gene was annotated based on a manual inspection of protein alignments.

This prediction contains a sequence that appears to be a triplication of the sequence from an adjacent model (GLEAN3_12845). It is yet to be determined whether this reflects a true exon multiplication/gene duplication or problems with the assembly. In addition, another model (GLEAN3_05983) codes for a very similar sequence (92% identical). Again, it is yet to be determined whether this reflects a true gene duplication or assembly problems.

Several Sp-IL17 genes were identified computationally. Each of them shows a best Blast-p hit to specific vertebrate IL17 family members. However, multiple sequence alignments demonstrated that almost all the Sp-IL17 sequences segregate from their vertebrate counterparts as a separate cluster. For consistency purposes, therefore, we decided to arbitrarily assign numerical identifiers to all the Sp-Il17 genes, and thus avoid the implication that they all have specific vertebrate orthologs (which are classified by the letters A-F).
###Gene_Info_Comments GLEAN3_22838 ###
This gene was annotated based on a manual inspection of sequence alignments.

There seems to be an annotation problem with this gene: the first two exons, located in one contig, are highly similar in sequence to the last three exons, which are located in a separate contig of the same scaffold. Since no other models map between these contigs, and since they lie at the end of the scaffold, this may represent a case of exon amplification, or it may be a case of erroneous assembly (haplotypes?) that led to a duplicated sequence within this model. We have not modified the present model because we have no independent evidence to support either claim.

Several Sp-IL17 genes were identified computationally. Each of them shows a best Blast-p hit to specific vertebrate IL17 family members. However, multiple sequence alignments demonstrated that almost all the Sp-IL17 sequences segregate from their vertebrate counterparts as a separate cluster. For consistency purposes, therefore, we decided to arbitrarily assign numerical identifiers to all the Sp-Il17 genes, and thus avoid the implication that they all have specific vertebrate orthologs (which are classified by the letters A-F).
###Gene_Info_Comments GLEAN3_12845 ###
This gene was annotated based on a manual inspection of protein alignments.

An adjacent prediction (GLEAN3_12844) contains a sequence that appears to be a triplication of this model. It is yet to be determined whether this reflects a true exon amplification/gene duplication or problems with the assembly. In addition, another model (GLEAN3_05983) codes for a very similar sequence (92% identical). Again, it is yet to be determined whether this reflects a true gene duplication or assembly problems.

Several Sp-IL17 genes were identified computationally. Each of them shows a best Blast-p hit to specific vertebrate IL17 family members. However, multiple sequence alignments demonstrated that almost all the Sp-IL17 sequences segregate from their vertebrate counterparts as a separate cluster. For consistency purposes, therefore, we decided to arbitrarily assign numerical identifiers to all the Sp-Il17 genes, and thus avoid the implication that they all have specific vertebrate orthologs (which are classified by the letters A-F).
###Gene_Info_Comments GLEAN3_11481 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 95% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.

###Gene_Info_Comments GLEAN3_27904 ###
This gene was annotated based on a manual inspection of protein alignments.

Several Sp-IL17 genes were identified computationally. Each of them shows a best Blast-p hit to specific vertebrate IL17 family members. However, multiple sequence alignments demonstrated that almost all the Sp-IL17 sequences segregate from their vertebrate counterparts as a separate cluster, except for this model that co-distributed with vertebrate IL17E (or IL25). For consistency purposes, however, we decided to arbitrarily assign numerical identifiers to all the Sp-Il17 genes, and thus avoid the implication that they all have specific vertebrate orthologs (which are classified by the letters A-F). At this point, a more careful analysis is needed to determine whether Sp-Il17-8 is indeed an ortholog of IL17E.
###Gene_Info_Comments GLEAN3_11541 ###
Unknown sequence (NNN...) in the intron of the current model could make this gene model incomplete. 2nd exon was eliminated based on BLASTN search.

###Gene_Info_Comments GLEAN3_13111 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 96% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.

###Gene_Info_Comments GLEAN3_13162 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 97% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.

###Gene_Info_Comments GLEAN3_14352 ###
This gene model may represent a pseudogene or contain a sequence error. 3'UTR sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified model contains some frame shifts, but reflects best gene structure.
###Gene_Info_Comments GLEAN3_14548 ###
This gene model may represent a pseudogene or contain a sequence error. 3'UTR sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified model contains some frame shifts, but reflects best gene structure.

###Gene_Info_Comments GLEAN3_14929 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model contains some frame shifts, but reflects best gene structure.

###Gene_Info_Comments GLEAN3_15185 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 93% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.

###Gene_Info_Comments GLEAN3_15553 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 96% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.
This is a member of sea urchin-specific Tlr Group IB. 

###Gene_Info_Comments GLEAN3_16388 ###
Unknown sequence (NNN...) in the intron of the current model could make this gene model incomplete. Intron sequence except NNN... matches a coding region of other Sp-Tlr genes. Modified gene model contains some frame shifts, but reflects best gene structure.
This is a member of sea urchin-specific Tlr Group ID.
###Gene_Info_Comments GLEAN3_11837 ###
RNA editing function; see also GLEAN3_11875 for an identical gene (duplication or redundant scaffold?)
###Gene_Info_Comments GLEAN3_22815 ###
Similar to G-protein coupled receptor 64 precursor (Epididymis-specific protein 6) (He6 receptor)
###Gene_Info_Comments GLEAN3_02587 ###
S.purpuratus elongation factor 1B gamma cDNA cloned (AJ973179)

###Gene_Info_Comments GLEAN3_15867 ###
S.purpuratus EF1B alpha cloned (AJ973180)

###Gene_Info_Comments GLEAN3_15285 ###
Part of this sequence is also contained in GLEAN3_26576.
Additional evidence for the existence of this gene has been obtained in Sphaerechinus granularis.
###Gene_Info_Comments GLEAN3_16438 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 98% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.

###Gene_Info_Comments GLEAN3_16554 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 97% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.
This is a member of sea urchin-specific Tlr Group IC.
###Gene_Info_Comments GLEAN3_17735 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 96% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.
###Gene_Info_Comments GLEAN3_17794 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 95% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.
###Gene_Info_Comments GLEAN3_18380 ###
This gene model may represent a pseudogene or contain a sequence error. 5'UTR sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified model contains some frame shifts, but reflects best gene structure.
This is a member of sea urchin-specific Tlr Group ID.

###Gene_Info_Comments GLEAN3_20428 ###
Partial Toll-like receptor. This gene model is located at the end of a contig. The nucleotide sequence has 94% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.

###Gene_Info_Comments GLEAN3_20644 ###
Unknown sequence (NNN...) in the intron of the current model could make this gene model incomplete. But the second exon was eliminated based on BLASTN search.
This is a member of sea urchin-specific Tlr Group IB. 
###Gene_Info_Comments GLEAN3_11299 ###
This gene was annotated based on a manual inspection of multiple protein alignments.

Several gene models related to this one were found. A multiple sequence alignment with various vertebrate and nematode related proteins failed to strongly support any homologies among them, and we have therefore arbitrarily numbered the members of this S.purpuratus family of genes.
###Gene_Info_Comments GLEAN3_20652 ###
Unknown sequence (NNN...) in the 3' and 5' UTRs of the current model could leave the modified gene model incomplete.
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_17901 ###
This gene was annotated based on a manual inspection of multiple protein alignments.

This gene model was modified based on an overlapping FgeneshAB prediction, which extends the protein sequence of this model and improves its alignment with related sequences from other phyla. Nonetheless, the model still seems incomplete after this modification (N-terminal sequence missing). The model is situated in a small scaffold, which likely accounts for the missing information.

Several gene models related to this one were found. A multiple sequence alignment with various vertebrate and nematode related proteins failed to strongly support any homologies among them, and we have therefore arbitrarily numbered the members of this S.purpuratus family of genes.
###Gene_Info_Comments GLEAN3_20124 ###
Sp-Elf has two splice variants differing in the 5' region:
Sp-Elf A       GLEAN3_20124
Sp-Elf B       GLEAN3_20123

###Gene_Info_Comments GLEAN3_12071 ###
This gene was annotated based on a manual inspection of multiple protein alignments.

The best Blast hit for this gene corresponds to a nematode MIF-like protein, and slightly worse hits correspond to vertebrate dopachrome tautomerase genes, which are closely related to MIFs. For this reason, we have arbitrarily named this gene Sp-Mif-like1.
###Gene_Info_Comments GLEAN3_19323 ###
This gene was annotated based on a manual inspection of multiple protein alignments.

The best Blast hit for this gene corresponds to human dopachrome tautomerase (a.k.a. phenylpyruvate tautomerase II), a gene very closely related to MIF (phenylpyruvate tautomerase). For this reason, we have arbitrarily named this gene Sp-Mif-like2.

Note that the genome-wide tiling array data correlate with the exon structure of this model, but that they also indicate high expression levels of a genomic region that falls in the second intron of this model. No other models cover the region, and it is unclear at this point what might account for these observations.
###Gene_Info_Comments GLEAN3_20035 ###
This gene was annotated based on a manual inspection of multiple protein alignments.

The genome-wide tiling array data correlate with the exon structure of this model.

Several gene models related to this one were found. A multiple sequence alignment with various vertebrate and nematode related proteins failed to strongly support any homologies among them, and we have therefore arbitrarily numbered the members of this S.purpuratus family of genes.
###Gene_Info_Comments GLEAN3_20036 ###
This gene was annotated based on a manual inspection of multiple protein alignments.

Note that the genome-wide tiling array data correlate with the exon structure of this model, but that they also indicate high expression levels of a genomic region that falls in the only intron of this model. This region, however, is included in a Genscan model (Supertig1576_6) on the opposite strand, suggesting it may reflect the expression of an overlapping gene. This is yet to be determined experimentally.

Several gene models related to this one were found. A multiple sequence alignment with various vertebrate and nematode related proteins failed to strongly support any homologies among them, and we have therefore arbitrarily numbered the members of this S.purpuratus family of genes.
###Gene_Info_Comments GLEAN3_16226 ###
This gene was annotated based on a manual inspection of multiple protein alignments.

Note that the genome-wide tiling array data correlate with the exon structure of this model, but that they also indicate expression of a genomic region that falls in the second intron of this model. No other models cover the region, and it is unclear at this point what might explain this observation.

Several gene models related to this one were found. A multiple sequence alignment with various vertebrate and nematode related proteins failed to strongly support any homologies among them, and we have therefore arbitrarily numbered the members of this S.purpuratus family of genes.
###Gene_Info_Comments GLEAN3_01152 ###
This gene was annotated based on a manual inspection of multiple protein alignments.

The genome-wide tiling array data correlate with the exon structure of this model. Note, however, that there are overlapping Fgenesh++/AB predictions that incorporate more C-terminal sequence than this GLEAN model, and these predictions are also supported by the genome-wide tiling array data. Since we have no experimental evidence to favor either model, we have accepted the GLEAN model in its present form.

Several gene models related to this one were found. A multiple sequence alignment with various vertebrate and nematode related proteins failed to strongly support any homologies among them, and we have therefore arbitrarily numbered the members of this S.purpuratus family of genes.
###Gene_Info_Comments Mif-2 ###
This gene was created from an NCBI prediction identified by manual inspection of multiple protein alignments. The NCBI prediction was accepted with no modifications. Note, however, that this model is incomplete (the CDS has no stop codon), which is likely because it is located in a small scaffold.

Several gene models related to this one were found. A multiple sequence alignment with various vertebrate and nematode related proteins failed to strongly support any homologies among them, and we have therefore arbitrarily numbered the members of this S.purpuratus family of genes.
###Gene_Info_Comments GLEAN3_19408 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.
###Gene_Info_Comments GLEAN3_22057 ###
This prediction lacks the 5' terminus. The 5'-terminal sequence is in GLEAN3_10703 (scaffold 1972).
###Gene_Info_Comments GLEAN3_10703 ###
This prediction includes the Semaphorin 5'-terminal domain (predicted exons 1-5) and UDP-glucuronosyltransferase (predicted exon 6). See GLEAN3_22057 (scaffold 181) for the other part of the Semaphorin sequence.

This prediction maps to Scaffoldi5629, and a hand-edited GeneWise prediction identifies a gene with the architecture NH2, SEMA, PSI, TM, COOH. This sequence is a class 6 Sema and the peptide sequence in Gene Sequences has been updated. (RDB 3 May06)
###Gene_Info_Comments GLEAN3_21931 ###
GLEAN3_16610 (scaffold 50410) is also a likely ortholog of AADC, but the prediction is missing 40 aa that are consistent with predicted exon 3 of GLEAN3_21931.
###Gene_Info_Comments GLEAN3_16610 ###
This prediction is missing 40 aa. GLEAN3_21931 is also a likely ortholog of AADC.
###Gene_Info_Comments GLEAN3_03528 ###
Part of this sequence is also contained in GLEAN3_11191. Exons 3 and 4 are missing in GLEAN3_11191.
Additional evidence for the existence of this gene has been obtained in Sphaerechinus granularis.
###Gene_Info_Comments GLEAN3_06778 ###
This gene is on two scaffolds (59529 and 1777). On scaffold 59529, one GLEAN model is predicted for this gene (GLEAN3_06778 for exons 1-5). On scaffold 1777, one GLEAN model is predicted for this gene (GLEAN3_08535 for exons 3-24).
Please refer to GLEAN3_08535 for refined gene features.
###Gene_Info_Comments GLEAN3_08535 ###
This gene is on two scaffolds (59529 and 1777). On scaffold 59529, one GLEAN model is predicted for this gene (GLEAN3_06778 for exons 1-5). On scaffold 1777, one GLEAN model is predicted for this gene (GLEAN3_08535 for exons 3-24).

###Gene_Info_Comments GLEAN3_15654 ###
One more exon was accepted as a coding region by comparison with the corresponding FgeneshAB and Fgenesh++ predictions.
###Gene_Info_Comments GLEAN3_07240 ###
Alignment with best blast sequence suggests that the model is good.
###Gene_Info_Comments GLEAN3_19579 ###
Blast alignments suggest that the model is good.
###Gene_Info_Comments GLEAN3_19610 ###
Blast alignments suggest that the model is good.

There are two excellent matches defined  by Genscan that are not on the glean3 list.  The first on scaffold 2001_1, 1128-31566, is likely to be a haplotype sequence.  The second is on scaffold 102441_1, 594-10744.
###Gene_Info_Comments GLEAN3_22407 ###
Partial CDS.

There are two excellent matches defined by Genscan that are not on the glean3 list.  Scaffold2--1_1, 1128-31455 and Scaffold102411_1, 591-10744
###Gene_Info_Comments GLEAN3_02939 ###
Long sequence with homology to Neurofilament Heavy Polypeptide in the N-terminal region and to acin1 in the C-terminal region.
For SpAcin1, see the duplicated gene Glean_02578.
###Gene_Info_Comments GLEAN3_10966 ###
Blast alignment suggests that the model is good.
###Gene_Info_Comments GLEAN3_07208 ###
THREE COMMENTS:

1) Alignment with best blast sequence suggests that N-terminal sequences are missing in the model.
2) It also suggests that an internal exon is missing between the following two exons.
>GLEAN3_07208|Scaffold38521|27220|27355| DNA_SRC: Scaffold38521 START: 27220 STOP: 27355 STRAND: + 
TGAGCTGAGTAAAAGCATTCATCCCTCCTCCTCTCATTGCTGAGCGTATGATGTAGAAGAGAAAACCCAG
AACAAGGACTGTTCCCAGGATACTCATCATTCCATCAGAGCTAGAATGCTGGTAACGGATCTGAAT
>GLEAN3_07208|Scaffold38521|28790|28906| DNA_SRC: Scaffold38521 START: 28790 STOP: 28906 STRAND: + 
CTCTCTACCAAAGACGATAGCACCATCATGCAGGAAGACATGGACTACATCAGACTCAGGATTCTTGCTC
TCTCCACTGCCATCAGTATCATAATACGTTACTGACACATCCTTCAC
3) There are matches to the model sequence on scaffolds 115226_1, 13 to 446 and 118973_1, 377 to 537.


###Gene_Info_Comments GLEAN3_25264 ###
Alignment with best blast sequence suggests that the model is missing an internal exon, N-terminal sequence and C-terminal sequence. See the notes for GLEAN3_07208, which is probably the same gene, concerning the internal exon.


###Gene_Info_Comments GLEAN3_07374 ###
Blast alignments suggest that the model is good.  Sequence similarity with Hs sequence is 89% at amino acid level!
###Gene_Info_Comments GLEAN3_02234 ###
Alignment with best blast sequence suggests that the model is good. 88% amino acid identity over 389 residues!
###Gene_Info_Comments GLEAN3_16198 ###
Alignment with best blast sequence suggests that the N-terminal sequence in the model is not conserved although the remainder of the protein is 90% identical.  The N-terminal sequence cannot be confirmed by blast alignment.
###Gene_Info_Comments GLEAN3_28593 ###
Alignment with best blast sequence suggests that the N-terminal sequence in the model is not conserved, although the remainder of the protein is 90% identical. The N-terminal sequence cannot be confirmed by blast alignment.
###Gene_Info_Comments GLEAN3_18934 ###
Alignment with best blast sequence suggests assembly problems with this model. N-terminal half sequences in the model match C-terminal half sequences in the best blast hit, and C-terminal half sequences in the model match N-terminal half sequences in the best blast hit. The N-terminal sequence in the model is not conserved and cannot be verified.
###Gene_Info_Comments GLEAN3_19415 ###
Alignment with best blast sequence suggests that an exon is missing between the following predicted exons:
>GLEAN3_19415|Scaffold469|23524|23696| DNA_SRC: Scaffold469 START: 23524 STOP: 23696 STRAND: + 
GCTAAGATGGATGAGCTTCAGCTCTTCCGTGGAGACACAGTCATGCTCAAAGGCAAGAAAAGGCGAGACA
CCGTCTGCATTGTACTCTCAGATGACACCGTAACAGATGACAAGATTCGTGTCAACCGAGTTGTCAGGAG
TAATCTTCGCGTTCGTCTAGGAGACATTGTCAG
>GLEAN3_19415|Scaffold469|25849|26041| DNA_SRC: Scaffold469 START: 25849 STOP: 26041 STRAND: + 
AAACCTCTTTGATGTATACCTGAGGCCGTACTTCCAGGAGGCGTACCGCCCCGTCAGGAAAGGTGACATC
TTTCAAATCCGTGGAGGCATGAGGGCGGTAGAATTCAAAGTGGTGGAAACAGACCCCGGACCATACTGCA
TCGTTTCACCTGATACAGTCATACACTTTGAGGGAGATGCAATCAAGCGAGAG

###Gene_Info_Comments GLEAN3_22919 ###
Alignment with best blast sequence suggests that only a small segment (~10%) of the model is conserved with the best blast hit.

Alignment of the Genscan model shows a much longer alignment, suggesting that GLEAN3_22918, 22919, 22920 and 22921 should be combined in one model. However, two exons in the Genscan model are not conserved and cannot be confirmed. They are:
>Supertig39397_1|Scaffold39397|21179|21262| DNA_SRC: Scaffold39397 START: 21179 STOP: 21262 STRAND: + 
GAGTGCATCCAACAGCTGACGTCAGATGATGCGTGGTATCCGTGCGGAGAAGGGCGCGAAAATTCAACAG
ATTATTGGCTGAAG
>Supertig39397_1|Scaffold39397|26430|26602| DNA_SRC: Scaffold39397 START: 26430 STOP: 26602 STRAND: + 
GGGCTATCTGCAAGGCTAAATGTCGTTTGGGTCGGTCATACCAGATTGCCCTGGAGGGTGCTACCTACAG
AGAGCGGGGGCTGCTGGGAAGGCATGAGGTCATCCTGTGCACGGCCGGTGGTGTCTGGTACCCCAACCTC
GACCAGATAGTATGTCATGAAAAATGCTTGGAG
###Gene_Info_Comments GLEAN3_18513 ###
Alignment with best blast sequence verifies that all but the N-terminal sequence is conserved.   This is another haplotype copy of the combination GLEAN3_22918, 22919, 22920 and 22921 (see Genscan 61252_1 or annotation notes for GLEAN3_22919)
###Gene_Info_Comments GLEAN3_12016 ###
Alignment with best blast sequence shows that only part of the gene is present on this scaffold, 1962.
###Gene_Info_Comments GLEAN3_06212 ###
Alignment with best blast sequence shows that model does not include N-terminal sequence.
###Gene_Info_Comments GLEAN3_13070 ###
Alignment with best blast sequence suggests that the N-terminal exon may not be part of the protein; it is not conserved and the remainder of the model matches the entire length of the best blast hit. The questionable exon is:
>GLEAN3_13070|Scaffold31758|4248|4352| DNA_SRC: Scaffold31758 START: 4248 STOP: 4352 STRAND: + 
TCAGGCTGCAGAAACAGCAGGAATAGACTTTGTGATGACAGCTCTTTGAGAAGGGATTCTGTGAGGTAGC
TCAGGGCAGTTGCAGTACTGTCTAACCTGGATGGC
###Gene_Info_Comments GLEAN3_01178 ###
Alignment with best blast sequences shows that all but a short N-terminal region of the model is conserved; an internal exon consisting of a string of serines may be missing.
###Gene_Info_Comments GLEAN3_16511 ###
Alignment with best blast sequence suggests that the model may lack N- and C-terminal sequence; conserved sequences are located in the central region of the best blast hits.
###Gene_Info_Comments GLEAN3_10215 ###
Alignment with best blast sequence suggests that the model lacks conserved N-terminal sequence.
###Gene_Info_Comments GLEAN3_16327 ###
Alignment with best blast sequence suggests that the model is nearly complete, but lacks an internal exon between the following predicted exons:
>GLEAN3_16327|Scaffold102457|14501|14625| DNA_SRC: Scaffold102457 START: 14501 STOP: 14625 STRAND: + 
TTGAGAGCCTCTCTTGAGAGAGACAGGTAGATTCCTACACACTCTGCTCTACATTCTTCATAAGGTGAGG
CAATCACAGGAAACTTGGAATCCCAAACCTCTCCTGGCATGTACCATGATGAGAT
>GLEAN3_16327|Scaffold102457|18405|18511| DNA_SRC: Scaffold102457 START: 18405 STOP: 18511 STRAND: + 
CTTGTCTGTATCTGTCAAGAAGGTGATCGTCTGATCTTTACCGTAAGCAGACAGCACATTGCCTAGAGAT
ACATTCTTAAATCCTTCATCCTGACGAATGTCATCAT
###Gene_Info_Comments GLEAN3_28721 ###
TWO COMMENTS:
1) partial CDS; lacking C-terminal half and possibly short N-terminal region.
2) There is a match to a short segment of this model on Scaffold40239_1, 687-798, that is not on the glean3 list.
###Gene_Info_Comments GLEAN3_04886 ###
partial CDS; lacking both N- and C-terminal sequences, probably because this model is on a short scaffold.
###Gene_Info_Comments GLEAN3_26005 ###
Alignment with best blast sequence suggests that the model is good.
  
###Gene_Info_Comments GLEAN3_15187 ###
TWO COMMENTS:

1) Alignment of best blast sequence suggests that the model is good.
2) There is an excellent match on Scaffold131610, 127-321, that is not on the glean3 list.
###Gene_Info_Comments GLEAN3_00976 ###
partial CDS, lacking C-terminal half
###Gene_Info_Comments GLEAN3_23965 ###
TWO COMMENTS:

1) Alignment with top blast sequences suggests the model is good.
2) There is an excellent match to a portion of this model on Scaffold17468_1, 217-384.
###Gene_Info_Comments GLEAN3_23686 ###
Alignment with best blast data and blasts with individual exons strongly suggest that the only exons in the model that should be included are:
>GLEAN3_23686|Scaffold1004|11066|11134| DNA_SRC: Scaffold1004 START: 11066 STOP: 11134 STRAND: + 
ATGGCGGATGAGCGCTACGTCATTGACGTTCTGGTTTGTTGTTGTCAAGAGATAAGGGTAGGGTTGCAT
>GLEAN3_23686|Scaffold1004|143554|143633| DNA_SRC: Scaffold1004 START: 143554 STOP: 143633 STRAND: + 
GTCGCAGTGGACATGGAGTTTGCCAAGAACATGTTTGAGTTACATAAAAAGGTGAACTCATGGGAGAACA
TTGTAGGATG
>GLEAN3_23686|Scaffold1004|143857|143994| DNA_SRC: Scaffold1004 START: 143857 STOP: 143994 STRAND: + 
GTATGCAACAGGACGTGACATCACAGGTCATTCAGTGCTGATACACGACTACTACTCAAGAGAGTGTCAA
AACCCGATTCACGTCACGGTCGATACAACGATGGTAGACCTCAATATGTCAGTCAAGACATGGGTTAG
>GLEAN3_23686|Scaffold1004|144623|144714| DNA_SRC: Scaffold1004 START: 144623 STOP: 144714 STRAND: + 
GCAAAATATGGGCGTACCAGACAAGTCACAAGGCACCGTGTTCATTCCAATTCCCATGAAAATCTCCTTC
CATCAACCAGAGAAAGTAGCAA
>GLEAN3_23686|Scaffold1004|145417|145553| DNA_SRC: Scaffold1004 START: 145417 STOP: 145553 STRAND: + 
TGGATGCGCTTATAAGGGAGACAGAACCAAACAGAAAAACCATTGAGTTGACGACCGATCTTCAGTACGT
GTCTAAAGCCTCTGGTAAACTTCAAGAGATGTTGACAAGAGTGCTCCAGTATGTTGATGATATCCTG
>GLEAN3_23686|Scaffold1004|146767|146880| DNA_SRC: Scaffold1004 START: 146767 STOP: 146880 STRAND: + 
AGTGGAAAGATTCAAGCCGATAACCAGATTGGCCGGTTTCTGATGAATCTAGTTTCCAATGTTCCTAAGT
TGCAGCCTGATGAGTTTGATGAGATGCTCAATAACAGTATGAAG
>GLEAN3_23686|Scaffold1004|147401|147496| DNA_SRC: Scaffold1004 START: 147401 STOP: 147496 STRAND: + 
GATCTTCTGATGGTAGTCTACCTGTCCGGTCTGATTGAGACCCAGCTCACTCTCAACGAAAAGCTGACGT
TATCCAAAGCAGCTAATGCAGTTGCA
The other exons in the model blast either to different proteins or to nothing.
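
The exon records in these comments can be sanity-checked mechanically. The following is a minimal sketch, not part of the original annotation workflow: the function names (parse_exon_records, length_matches) are hypothetical, the header layout is inferred from the records in these comments, and START/STOP are assumed to be inclusive coordinates, which agrees with the first exon listed above for GLEAN3_23686 (69 nt for 11066-11134).

```python
import re

# Header layout assumed from the exon records in these comments:
# >MODEL|SCAFFOLD|START|STOP| DNA_SRC: ... START: ... STOP: ... STRAND: +
HEADER = re.compile(r"^>([^|]+)\|([^|]+)\|(\d+)\|(\d+)\|")
# Only pure nucleotide lines are treated as sequence; free-text notes are skipped.
SEQUENCE_LINE = re.compile(r"^[ACGTNacgtn]+$")


def parse_exon_records(text):
    """Return a list of (model, scaffold, start, stop, sequence) tuples."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            model, scaffold, start, stop = match.groups()
            records.append([model, scaffold, int(start), int(stop), ""])
        elif records and SEQUENCE_LINE.match(line):
            # Sequence may be wrapped over several lines; join them.
            records[-1][4] += line
    return [tuple(record) for record in records]


def length_matches(record):
    """True if the sequence length equals the inclusive START..STOP span."""
    _, _, start, stop, sequence = record
    return len(sequence) == abs(stop - start) + 1


EXAMPLE = """\
>GLEAN3_23686|Scaffold1004|11066|11134| DNA_SRC: Scaffold1004 START: 11066 STOP: 11134 STRAND: +
ATGGCGGATGAGCGCTACGTCATTGACGTTCTGGTTTGTTGTTGTCAAGAGATAAGGGTAGGGTTGCAT
"""

for record in parse_exon_records(EXAMPLE):
    print(record[0], record[1], record[2], record[3], length_matches(record))
```

A record whose sequence length disagrees with its coordinates would be worth re-checking by hand before it is used to revise a model.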
###Gene_Info_Comments GLEAN3_02210 ###
Additional evidence for the existence of this gene has been obtained in Sphaerechinus granularis.
###Gene_Info_Comments GLEAN3_12936 ###
Subgroup B thrombospondin. Large gap in the middle of the gene.
###Gene_Info_Comments Sp-ACE I-like ###
Genes or parts of genes predicted by Genscan that are not in the Glean3 list.
###Gene_Info_Comments Sp-TAF1-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomon, Fgenesh) that are not in the Glean3 list.


###Gene_Info_Comments Sp-leishmanolysin-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomon, Fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-TAF2-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomon, Fgenesh) that are not in the Glean3 list.


###Gene_Info_Comments GLEAN3_13106 ###
This may be 2 concatenated genes because of the (TSP3)n-TSP_C-LamGL domain architecture. It's also likely to be a partial gene model missing the N-terminus due to end of contig.
###Gene_Info_Comments Sp-MT4-mmp-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-mmp2-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-mmp27-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-matrix metallopeptidase2-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-matrix metallopeptidase3-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-matrix metalloproteinase2-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-matrix metalloproteinase3/14-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments GLEAN3_17370 ###
This gene encodes a protein with an unusual domain architecture. The N-terminus contains:
Pfam:F5_F8_type_C-Pfam:GCC2_GCC3 - which occurs in metazoa

The C-terminus looks like a subgroupB thrombospondin. Therefore, it might be a concatenation of 2 genes or a novel urchin architecture. 
There is no EST evidence or anything else to justify splitting it into 2. The NCBI gene model may be more accurate than the GLEAN3 model.
###Gene_Info_Comments Sp-meprinA-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-TLL1-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments GLEAN3_22667 ###
N-terminus may be truncated due to end of contig
###Gene_Info_Comments Sp-tolloid2-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-tolloid1-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-meprinA, beta-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-ADAMTS6-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-ADAMTS6 metalloprotease-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.


###Gene_Info_Comments Sp-metagidin-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-ADAM10-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-ADAMTS1-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-ADAM10 metallopeptidase-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-ADAMTS1 metallopeptidase-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.


###Gene_Info_Comments Sp-ADAM12-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-ADAMTS1 metalloprotease-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-endothelin converting enzyme1-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-ECE1-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-carboxypeptiaseD-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-aminopeptidase1-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-dipeptidease-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-ACY1L2-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-aminocyclase-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments GLEAN3_00959 ###
Glean3_00959 corresponds to the N-ter domain of Sp-EF1B delta.
Glean3_00959 contains three exons:
exon1 : scaffold1943|194310|194423|strand(-);  
exon2 : scaffold1943|191630|191707|strand(-); 
exon3 : scaffold1943|189643|189735|strand(-);

Two isoforms are expressed in sea urchin (Y14235 and AJ973181). Y14235 does not contain exon2.

The C-ter domain of Sp-EF1B delta is encoded by Glean3_00960
The entire sequence is given in annotation for GLEAN3_00960
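The exon coordinates in this entry allow a quick frame check on the two reported isoforms. The sketch below assumes the coordinates are 1-based and inclusive and that the exon boundaries are exactly as listed; `exon_length` is a hypothetical helper name.

```python
def exon_length(spec):
    """Length of an exon written as 'scaffold|start|stop|strand(-)'.

    Assumes inclusive coordinates, so length = stop - start + 1.
    """
    fields = spec.strip().rstrip(";").split("|")
    start, stop = int(fields[1]), int(fields[2])
    return abs(stop - start) + 1

# Exon strings copied from the entry above.
exons = {
    "exon1": "scaffold1943|194310|194423|strand(-)",
    "exon2": "scaffold1943|191630|191707|strand(-)",
    "exon3": "scaffold1943|189643|189735|strand(-)",
}

lengths = {name: exon_length(spec) for name, spec in exons.items()}
with_exon2 = sum(lengths.values())
without_exon2 = with_exon2 - lengths["exon2"]

print(lengths)
print("all three exons:", with_exon2, "nt")
print("exon2 skipped:  ", without_exon2, "nt")
print("exon2 length divisible by 3:", lengths["exon2"] % 3 == 0)
```

Under these assumptions exon2 spans a whole number of codons, so an isoform lacking it would keep the downstream reading frame.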


###Gene_Info_Comments GLEAN3_12281 ###
Same sequence as GLEAN3_11711 except the most N-terminal part.
Duplication most likely due to assembly process.
###Gene_Info_Comments Sp-aminocyclase1-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments GLEAN3_11711 ###
Same sequence as GLEAN3_12281 except the most N-terminal part.
Duplication most likely due to assembly process.
###Gene_Info_Comments Sp-O-sialoglycoprotein endopeptidase-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments GLEAN3_09593 ###
Binds the stem-loop structure of replication-dependent histone pre-mRNAs and contributes to efficient 3' end processing by stabilizing the complex between histone pre-mRNA and U7 small nuclear ribonucleoprotein (snRNP). Could play an important role in targeting mature histone mRNA from the nucleus to the cytoplasm and to the translation machinery. Stabilizes mature histone mRNA and could be involved in cell-cycle regulation of histone gene expression (By similarity). 
###Gene_Info_Comments GLEAN3_24780 ###
see annotation to GLEAN3_09593
###Gene_Info_Comments Sp-O-peptidaseD-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-O-NAALAD2-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.


###Gene_Info_Comments Sp-N-acetylated alpha-linked acidic dipeptidase 2-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments GLEAN3_20741 ###
This gene model may represent a pseudogene or contain a sequence error. 3'UTR sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. The modified model contains some frame shifts but reflects the best gene structure.
This is a member of sea urchin-specific Tlr Group ID.
###Gene_Info_Comments Sp-N-acetylated alpha-linked acidic dipeptidase 1-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-NAALAD2 metalloprotease-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-NAALAD2-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-NAALAD2-like protease ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments GLEAN3_21075 ###
This gene model may represent a pseudogene or contain a sequence error. 5'UTR sequence matches coding sequence of other Sp-Tlr genes and contains stop codons (frame shift).
###Gene_Info_Comments Sp-aspartate transcarbamylase-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments GLEAN3_24092 ###
Same sequence as GLEAN3_00814 except the most N-terminal part.
Duplication most likely due to assembly process.
###Gene_Info_Comments Sp-carbamoylphosphate synthetase-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments GLEAN3_00814 ###
Same sequence as GLEAN3_24092 except the most N-terminal part.
Duplication most likely due to assembly process.
###Gene_Info_Comments Sp-carbamoylphosphate ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments GLEAN3_21283 ###
The nucleotide sequence of this gene model has 100% identity to GLEAN3_02442. It may be caused by an assembly error.

###Gene_Info_Comments GLEAN3_20322 ###
Also shows many similarities to plant HSP-90. Glean3_20322 looks to be the same as Glean3_01586 and above Glean
###Gene_Info_Comments GLEAN3_21362 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold.
###Gene_Info_Comments Sp-allantoinase-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments GLEAN3_21502 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 94% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments Sp-YME1-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-paraplegin-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments GLEAN3_23179 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 94% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.
###Gene_Info_Comments Sp-STE24p-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments GLEAN3_23491 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 99% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.
This is a member of sea urchin-specific Tlr Group ID.
###Gene_Info_Comments GLEAN3_24501 ###
Partial Toll-like receptor. The nucleotide sequence has 95% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.
This is a member of sea urchin-specific Tlr Group ID.
###Gene_Info_Comments GLEAN3_28424 ###
This GLEAN prediction contains only the C-term (tyrosine kinase domain) of the protein. GLEAN3_07624 appears to contain the N-term (extracellular domain).
###Gene_Info_Comments GLEAN3_07624 ###
This GLEAN prediction contains only the N-term (extracellular domain) of the protein. GLEAN3_28424 appears to contain the C-term (tyrosine kinase domain).
###Gene_Info_Comments GLEAN3_24590 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 97% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.
This is a member of sea urchin-specific Tlr Group IC.
###Gene_Info_Comments GLEAN3_24847 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 96% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.
###Gene_Info_Comments GLEAN3_23989 ###
Exons 3-7 are from this scaffold533 and GLEAN3_23989 prediction except for exon 5 which was only predicted by the Fgenesh++ prediction. Exon 2 is from scaffold65249 with no tracks predicting it. Exon 1 is incomplete and present on scaffold137005 with no tracks predicting it. Exons 8-19 are from  scaffold57107 and GLEAN3_20620 prediction. Refer to GLEAN3_20620 for the complete gene features of REJ2.
###Gene_Info_Comments GLEAN3_27735 ###
A 792 bp stretch of 5'UTR that was highly similar to other Sp-Tlr genes was accepted as coding region on the basis of a BLASTN search. The modified gene model is located at the end of a contig, so the model could still be incomplete.
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_20718 ###
Exons 1-7 are present on this scaffold772 and GLEAN3_20718 prediction. Exons 8-28 are present on scaffold772 and GLEAN3_09002 prediction. Refer to GLEAN3_09002 for the complete gene features of REJ3.
###Gene_Info_Comments Sp-COP9signalosome/s6-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.


###Gene_Info_Comments GLEAN3_10579 ###
Exons 1-12 and the beginning of 13 are present on this scaffold601 and GLEAN3_10579 prediction. In the middle of exon 13 on scaffold601 there is a huge gap of N's, where the remaining exons should be present. The last 2 exons (named 14 and 15) are present on Scaffold17211. GLEAN3_18502 predicts exon 14 but not 15. Fgenesh++ predicts exon 15.
###Gene_Info_Comments Sp-EIF3S3-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-angiotensin I converting enzyme-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments GLEAN3_10180 ###
A part of the 3rd intron was accepted as coding region by comparison to the corresponding FgeneshAB and Fgenesh++ predictions. Unknown sequence (NNN...) in the 1st intron of the current model could make this gene model incomplete.
###Gene_Info_Comments Sp-mmp20-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.


###Gene_Info_Comments GLEAN3_12211 ###
This gene model is located at the end of a short scaffold. Three exons were added at the 3'end of the GLEAN3 model by comparison to the corresponding Genscan prediction.
###Gene_Info_Comments Sp-NAALAD2 protease-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments GLEAN3_18915 ###
Partial TNF receptor. This gene model is located at the end of a short scaffold
###Gene_Info_Comments Sp-leishmanolysin-like metalloprotease ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-ACE-T-like ###
Genes or parts of genes predicted by one or more of the prediction programs (Genscan, Gnomen, fgenesh) that are not in the Glean3 list.
###Gene_Info_Comments Sp-angiotensin I converting enzyme isoform 1 precursor-like ###
Genes or parts of genes predicted by Genscan that are not in the Glean3 list.
###Gene_Info_Comments GLEAN3_10926 ###
Expression is difficult to confirm.
Highly similar to Glean3_13313.  
###Gene_Info_Comments GLEAN3_20740 ###
Unknown sequence (NNN...) in the intron of the current model could make this gene model incomplete.
###Gene_Info_Comments GLEAN3_24584 ###
Partial TNF receptor. This gene model is located at the end of a short scaffold. The first exon was changed by comparison to the corresponding FgeneshAB and Fgenesh++ predictions. 
###Gene_Info_Comments GLEAN3_10230 ###
This gene model was modified based on FgeneshAB ++ prediction. The model is located at the end of a scaffold, which could make it still incomplete.
###Gene_Info_Comments GLEAN3_20955 ###
Partial TNF receptor. This gene model is located at the end of a short scaffold. 
###Gene_Info_Comments GLEAN3_02328 ###
last exon has large stretch of NNNNs, will examine later for correction
###Gene_Info_Comments GLEAN3_06859 ###
Exons 1-4 are on this scaffold59004 and GLEAN3_06859. Exons 5-7 are on scaffold83735 and GLEAN3_21516.
###Gene_Info_Comments Sp-Twist ###
Determined by the sequence info from Lv-Twist
###Gene_Info_Comments GLEAN3_12238 ###
glean3_02950 is a partial sequence identical to this one, but on a different scaffold
###Gene_Info_Comments GLEAN3_19440 ###
Nectin sequence from L.variegatus hits 5 GLEAN predictions with a score of 0.0.  This prediction has 5 exons that are not in cDNA sequence, embryonic expression data agrees with cDNA and predicted protein mass is larger than that found in eggs.
###Gene_Info_Comments GLEAN3_11009 ###
Highest homology to CYP3 family genes. Phylogenetically falls at base of CYP3 family, within CYP clan 3 (MrBayes v3.0.1 WAG+gamma+I model). Percent ID (on masked basis) to CYP3A4 (human) of 42.3% (aa level). Gene model modified to include 2 more exons as predicted by FGENESH+ homology to GLEAN3_14573
###Gene_Info_Comments GLEAN3_13642 ###
Exons 5-12 are present on this scaffold85921 with GLEAN3_13642 and FgeneshAB predictions. Exons 1-4 are on scaffold6090 with GLEAN3_20643 and FgeneshAB predictions.
###Gene_Info_Comments GLEAN3_20643 ###
Exons 1-4 are on this scaffold6090 with GLEAN3_20643 and FgeneshAB predictions. Exons 5-12 are present on scaffold85921 with GLEAN3_13642 and FgeneshAB predictions. Refer to GLEAN3_13642 for the complete Sp-HCN2 gene features.
###Gene_Info_Comments GLEAN3_06462 ###
Scaffold1146 contains stretches of Ns in exon 1 of GLEAN3_06462 (GCM), and the length of this region is shorter than that of the corresponding region in the experimentally verified mRNA sequence. The scaffold sequence and the coordinates of exon 1 should be revised. 
###Gene_Info_Comments GLEAN3_10805 ###
duplicate GLEAN3_17694
###Gene_Info_Comments GLEAN3_17511 ###
Although Type2 TGFbeta receptors belong to the "Ser/Thr Protein Kinase" family and this sequence gives TGFbetaR2 as its best GenBank hit, NCBI BLAST predicts a "Tyr Kinase" domain for this gene.
###Gene_Info_Comments GLEAN3_27254 ###
see GLEAN3_08876
###Gene_Info_Comments GLEAN3_08197 ###
Highly similar to GLEAN3_03264.
###Gene_Info_Comments GLEAN3_00960 ###
Glean3_00960 corresponds to C-ter domain of Sp-EF1B delta
Glean3_00960 contains 7 predicted exons. However only six correspond to the known sequences for S. granularis EF1B delta (Y14235 and AJ973181).
Among the 6 remaining exons, one exon (Scaffold1943|187113|187206|) is strictly identical to exon3 of Glean3_00959 (scaffold1943|189643|189735|) 

We therefore propose to construct the CDS for Sp-EF1B delta from 2 exons of Glean3_00959 plus the common exon between Glean3_00959 and Glean3_00960 plus 5 exons of Glean3_00960 as follows :
exon1 Glean3_00959 scaffold1943|194310|194423|Strand(-);  
exon2 Glean3_00959 scaffold1943|191630|191707|Strand(-); 
exon3 Glean3_00959 scaffold1943|189643|189735|Strand(-) 
 identical to Glean3_00960 scaffold1943|187113|187206|Strand(-);
exon4 Glean3_00960: Scaffold1943|186520|186614|Strand(-);
exon5 Glean3_00960: Scaffold1943|185409|185492|Strand(-);
exon6 Glean3_00960: Scaffold1943|183100|183319|Strand(-);
exon7 Glean3_00960: Scaffold1943|181425|181540|Strand(-);
exon8 Glean3_00960: Scaffold1943|180516|180554|Strand(-);
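The proposed exon list can be checked for internal consistency with a few lines of code. The sketch below is illustrative only: it assumes 1-based inclusive coordinates and uses the Glean3_00959 coordinates for the shared exon3. It reports the per-exon lengths, the total length of the joined exons, and whether the exons are listed in strictly descending, non-overlapping scaffold order, as would be expected when reading 5' to 3' on the minus strand. The name `PROPOSED_CDS` is hypothetical.

```python
# (source model, start, stop) for each exon, copied from the list above.
PROPOSED_CDS = [
    ("Glean3_00959", 194310, 194423),  # exon1
    ("Glean3_00959", 191630, 191707),  # exon2
    ("Glean3_00959", 189643, 189735),  # exon3 (shared exon, 00959 coordinates)
    ("Glean3_00960", 186520, 186614),  # exon4
    ("Glean3_00960", 185409, 185492),  # exon5
    ("Glean3_00960", 183100, 183319),  # exon6
    ("Glean3_00960", 181425, 181540),  # exon7
    ("Glean3_00960", 180516, 180554),  # exon8
]

def span(start, stop):
    """Inclusive length of a coordinate pair (assumed 1-based, inclusive)."""
    return stop - start + 1

lengths = [span(start, stop) for _, start, stop in PROPOSED_CDS]
total = sum(lengths)

# On the minus strand, each exon should lie entirely below the previous one.
descending = all(
    PROPOSED_CDS[i][1] > PROPOSED_CDS[i + 1][2]
    for i in range(len(PROPOSED_CDS) - 1)
)

for number, ((model, start, stop), length) in enumerate(
    zip(PROPOSED_CDS, lengths), start=1
):
    print(f"exon{number}\t{model}\t{start}-{stop}\t{length} nt")
print("total length:", total, "nt")
print("descending, non-overlapping order:", descending)
```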


###Gene_Info_Comments GLEAN3_00229 ###
Closely related to GLEAN3_09517
Probably missing the N-terminus due to end of contig
###Gene_Info_Comments GLEAN3_04340 ###
This sequence was modified by adding the 5' sequence from GLEAN3_06789
###Gene_Info_Comments GLEAN3_26288 ###
Exons 2-9 are on this scaffold97931 and GLEAN3_26288 prediction. Exon 1 is on scaffold96 with GLEAN3_02214 prediction.
###Gene_Info_Comments GLEAN3_06789 ###
This sequence has been added to the 5' end of GLEAN3_04340
###Gene_Info_Comments GLEAN3_13709 ###
There was a long unknown sequence (NNN) between the 7th and 8th exons. The 1st to 7th exons were eliminated.
###Gene_Info_Comments GLEAN3_23670 ###
also GLEAN3_06070 
###Gene_Info_Comments GLEAN3_06070 ###
also GLEAN3_23670
###Gene_Info_Comments GLEAN3_03292 ###
also GLEAN3_16634
###Gene_Info_Comments GLEAN3_16634 ###
also GLEAN3_03292
###Gene_Info_Comments GLEAN3_17117 ###
probable atp-dependent helicase ddx41 (dead-box protein 41)
###Gene_Info_Comments GLEAN3_00526 ###
Exons 1-28 are on this scaffold80510 and GLEAN3_00526. Exons 29-58 are from scaffold1165 and GLEAN3_18112. Exons 49-58 are repeats and may be alternatively spliced. Exons 59-60 are from scaffold100796 and GLEAN3_08369.
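When a gene is split across several scaffolds like this, a small lookup table makes it easy to find which scaffold and GLEAN3 model hold a given exon. The following is a minimal sketch built only from the exon ranges stated in this entry; the names `SEGMENTS` and `locate_exon` are hypothetical.

```python
# (first exon, last exon, scaffold, GLEAN3 model), as stated in this entry.
SEGMENTS = [
    (1, 28, "scaffold80510", "GLEAN3_00526"),
    (29, 58, "scaffold1165", "GLEAN3_18112"),
    (59, 60, "scaffold100796", "GLEAN3_08369"),
]

def locate_exon(number, segments=SEGMENTS):
    """Return (scaffold, model) for an exon number, or raise if not covered."""
    for first, last, scaffold, model in segments:
        if first <= number <= last:
            return scaffold, model
    raise ValueError(f"exon {number} is not covered by any listed segment")

for exon in (1, 28, 29, 49, 60):
    print(exon, *locate_exon(exon))
```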
###Gene_Info_Comments GLEAN3_05854 ###
The 3' end GLEAN3_05854 is located near the edge of the scaffold. By comparison to the published cDNA (AAB67801), the model is missing 3' exons coding for 106 C-terminal aa. Exon 6 was modified to agree with known cDNAs. 
###Gene_Info_Comments GLEAN3_27915 ###
GLEAN prediction originally for 386 to 806 (end).  The front end 1-386 is on GLEAN3_27915.  Use annotations on GLEAN3_20342.
###Gene_Info_Comments GLEAN3_20342 ###
GLEAN prediction originally for 386 to 806 (end).  The front end 1-386 is on GLEAN3_27915.  Exon 1 does not agree with cDNA sequence, two small intron exon boundary problems when aligned with BetaC cDNA
###Gene_Info_Comments GLEAN3_14418 ###
glean3_14418, glean3_27648 are the same gene, two different alleles. 
###Gene_Info_Comments GLEAN3_27648 ###
glean3_14418, glean3_27648 are the same gene, two different alleles. 
###Gene_Info_Comments GLEAN3_02292 ###
GLEAN3_02292 has a tandem duplication of the N-terminal half of the C2A domain.
###Gene_Info_Comments GLEAN3_15621 ###
partial CDS of N-terminal region of GLEAN3_24838
###Gene_Info_Comments GLEAN3_12856 ###
GLEAN3_12856 has a tandem duplication of the N-terminal half of the C2A domain.
###Gene_Info_Comments GLEAN3_11065 ###
CDS contains 26-631 of betaG cDNA sequence. First and last exons are missing from the scaffold.
###Gene_Info_Comments GLEAN3_15213 ###
dna mismatch repair protein mlh1 (mutl protein homolog 1)
###Gene_Info_Comments GLEAN3_16009 ###
Prediction has small errors at exon boundaries by comparison with cDNA.  The last exon is incorrect.
###Gene_Info_Comments GLEAN3_28261 ###
GLEAN3_28261 is the C-terminal match to MSH6 mouse.
GLEAN3_12960 is the N-terminal match to MSH6 mouse.
###Gene_Info_Comments GLEAN3_26225 ###
Highest homology to CYP3 family genes. Phylogenetically falls at base of CYP3 family, within CYP clan 3 (MrBayes v3.0.1 WAG+gamma+I model). Percent ID (on masked basis) to CYP3A4 (human) of 45.0% (aa level).
###Gene_Info_Comments GLEAN3_21255 ###
Highest homology to CYP3 family genes. Phylogenetically falls at base of CYP3 family, within CYP clan 3 (MrBayes v3.0.1 WAG+gamma+I model). Percent ID (on masked basis) to CYP3A4 (human) of 37.9% (aa level). GLEAN3 gene model altered to follow FGENESH+ predictions and cDNA sequences (added exon 3 and 4).
###Gene_Info_Comments GLEAN3_21588 ###
Predicted C-terminal sequence is longer than those of other organisms. 
###Gene_Info_Comments GLEAN3_12985 ###
Part of a previously unknown beta subunit.  The 5' end is missing because scaffold is incomplete.  Good evidence for embryonic expression.
###Gene_Info_Comments GLEAN3_13102 ###
Highest homology to CYP3 family genes. Phylogenetically falls at base of CYP3 family, within CYP clan 3 (MrBayes v3.0.1 WAG+gamma+I model). Percent ID (on masked basis) to CYP3A4 (human) of 35.8% (aa level). By homology to other CYP3-like genes appears to be missing exons 1-4
###Gene_Info_Comments GLEAN3_09766 ###
See GLEAN3_13107 (scaffold 457). 
###Gene_Info_Comments GLEAN3_13107 ###
See GLEAN3_09766 (scaffold 98369). 
###Gene_Info_Comments GLEAN3_07644 ###
Different parts of the cDNA are found in different strands in a non-linear organization. 
###Gene_Info_Comments GLEAN3_11635 ###
Different parts of the cDNA are found in different strands in a non-linear organization. 
###Gene_Info_Comments GLEAN3_17379 ###
GLEAN3_17379 matches SIX1_mouse and SIX2_mouse in the Homeobox with high identity. 
###Gene_Info_Comments GLEAN3_17380 ###
Six5 also hits the same Glean3 gene.
###Gene_Info_Comments GLEAN3_12076 ###
This Glean model is a concatenation of 2 adjacent genes.
The C-terminus is a Calcium channel alpha2 delta subunit.
The N-terminus is a homolog of human Q69YN2.
###Gene_Info_Comments GLEAN3_14621 ###
GLEAN3_04346 could be an alternate model of this gene.
###Gene_Info_Comments GLEAN3_04346 ###
see also GLEAN3_14621
###Gene_Info_Comments GLEAN3_20140 ###
GLEAN3_22433 has high sequence homology and could be a variant.
###Gene_Info_Comments GLEAN3_18112 ###
Exons 29-58 are from this scaffold1165 and GLEAN3_18112. Exons 49-58 are repeats and may be alternatively spliced. Exons 1-28 are on scaffold80510 and GLEAN3_00526. Exons 59-60 are from scaffold100796 and GLEAN3_08369. Refer to GLEAN3_00526 for the complete Sp-EBR1 gene features.
###Gene_Info_Comments GLEAN3_22433 ###
see also GLEAN3_20140
###Gene_Info_Comments GLEAN3_09210 ###
This Model contains a partial sequence relative to GLEAN3_20140
###Gene_Info_Comments GLEAN3_27905 ###
Likely haplotype of GLEAN3_28897, but 27905 gene model incomplete. See GLEAN3_28897 (Sp-apn6) for annotation.
###Gene_Info_Comments GLEAN3_08369 ###
Exons 59-60 are from this scaffold100796 and GLEAN3_08369. Exons 1-28 are on scaffold80510 and GLEAN3_00526. Exons 29-58 are from scaffold1165 and GLEAN3_18112. Exons 49-58 are repeats and may be alternatively spliced. Refer to GLEAN3_00526 for the complete Sp-EBR1 gene features.
###Gene_Info_Comments GLEAN3_23808 ###
Highest homology to CYP3 family genes. Phylogenetically falls at base of CYP3 family, within CYP clan 3 (MrBayes v3.0.1 WAG+gamma+I model). Percent ID (on masked basis) to CYP3A4 (human) of 45.0% (aa level). Edited to add exons 2 and 3 based on FGENESH+ prediction and cDNA evidence, and exon 5 was shortened. Possible missing or mispredicted exon 1. Removed from gene model.
###Gene_Info_Comments GLEAN3_22916 ###
More than 90 % identity with Fz5/8 from P. lividus. AC number AM084899   
###Gene_Info_Comments GLEAN3_23898 ###
Scaffold75943 covers exons 3-25 by merging GLEAN3_23898 and GLEAN3_23899 modified predictions. Exons 3, 6,13, and 19 have sequence gaps. Scaffold6755 covers exons 1-2 which have no GLEAN3 predictions. 
###Gene_Info_Comments GLEAN3_16056 ###
Highest homology to CYP3 family genes. Phylogenetically falls at base of CYP3 family, within CYP clan 3 (MrBayes v3.0.1 WAG+gamma+I model). Percent ID (on masked basis) to CYP3A4 (human) of 42.3% (aa level). Missing exons 1 and 2 due to incomplete assembly.
###Gene_Info_Comments GLEAN3_07406 ###
Member of CYP1 family. Tentatively designated CYP1F1 pending CYP nomenclature committee approval. Phylogenetically falls at base of CYP1 family, within CYP clan 2 (MrBayes v3.0.1 WAG+gamma+I model). Percent ID (on masked basis) to CYP1C1  (fugu) of 37.4% (aa level).
###Gene_Info_Comments GLEAN3_06989 ###
Member of CYP1 family. Tentatively designated CYP1F2 pending CYP nomenclature committee approval. Phylogenetically falls at base of CYP1 family, within CYP clan 2 (MrBayes v3.0.1 WAG+gamma+I model). Percent ID (on masked basis) to CYP1A (Danio rerio) of 41.7% (aa level).
###Gene_Info_Comments GLEAN3_01262 ###
See also GLEAN3_05908, which is the C-terminal portion only.

GLEAN3_01262 is annotated as Sp-Birc6, but aa 5240-6588 are similar to hypoxia-inducible factor 1 alpha (HIF-1a).  The assembly of nt sequence seems consistent with a modeled intronic region being, rather, an intergenic region.  The HIF-1a portion was annotated as such by M. Hahn. 
###Gene_Info_Comments GLEAN3_05908 ###
this glean model is a subset of GLEAN3_01262
###Gene_Info_Comments GLEAN3_20646 ###
aa 1-215 do not align to closest homologs
###Gene_Info_Comments GLEAN3_23899 ###
Scaffold75943 covers exons 3-25 by merging GLEAN3_23898 and GLEAN3_23899 modified predictions. Exons 3, 6,13, and 19 have sequence gaps. Scaffold6755 covers exons 1-2 which have no GLEAN3 predictions. Refer to GLEAN3_23898 for the complete gene features.
###Gene_Info_Comments GLEAN3_24610 ###
Ig2-Fz-TM-kinase
Best Blast hit is MUSK, which has one more Ig domain - gene may be missing N-terminal piece.
###Gene_Info_Comments GLEAN3_00310 ###
10 Ig-TM-kinase. Closest human protein is VEGFR-1, but only distantly related
###Gene_Info_Comments GLEAN3_21021 ###
7 Ig - TM - kinase
Best Blast hit is VEGFR1
###Gene_Info_Comments GLEAN3_01905 ###
BLASTP search shows this gene model has very low similarity to MyD88 in other animals, but domain structure shows it could be a member of MyD88 gene family. 
###Gene_Info_Comments GLEAN3_22684 ###
See GLEAN3_11320, _22683, _28351. 
###Gene_Info_Comments GLEAN3_16914 ###
Missing first 756 bp of coding gene sequence.  Scaffolds 42354, 65902, and 40781 contain the first 756 bp, and scaffold 23542 contains bp 1420 to 2189 of the coding region.
###Gene_Info_Comments GLEAN3_22683 ###
See GLEAN3_11320, _22684, _28351. 
###Gene_Info_Comments GLEAN3_28351 ###
See GLEAN3_11320, _22683, _22684. 
###Gene_Info_Comments GLEAN3_19779 ###
Overlaps with GLEAN3_23615
###Gene_Info_Comments GLEAN3_23615 ###
Overlaps with GLEAN3_19779
###Gene_Info_Comments GLEAN3_02050 ###
GLEAN3_02050|Scaffold111471|764|1126| corresponds to the last exon encoding Sp-EF1A;
the other exons are contained in Glean3_00595 (see that entry for the complete gene)
###Gene_Info_Comments GLEAN3_00595 ###
EF1A is encoded by 5 exons in Glean3_00595(scaffold1277) plus one exon in Glean3_02050(scaffold111471)
the sequence was constructed on the basis of a fusion between the two scaffolds
###Gene_Info_Comments Arnone1 ###
Exons 1-7 are on Scaffold22273; exons 2-8 are on Scaffold59839
###Gene_Info_Comments GLEAN3_12295 ###
See GLEAN3_15341 for annotation.
###Gene_Info_Comments GLEAN3_03996 ###
See GLEAN3_15341 for annotation
###Gene_Info_Comments GLEAN3_08008 ###
The protein aligns with dentin and other proteins with long stretches of serine/Asp/Gln. This is not real.
###Gene_Info_Comments GLEAN3_22707 ###
This gene model has a Death domain and two TIR domains, which indicates a member of MyD88 family.
###Gene_Info_Comments GLEAN3_04955 ###
 partial
###Gene_Info_Comments GLEAN3_26723 ###
See GLEAN3_08008 for annotation
###Gene_Info_Comments GLEAN3_11843 ###
Aligns to vertebrate dentin due to serine repeat; not true homology.
###Gene_Info_Comments GLEAN3_00567 ###
Partial CDS.  This model contains the N-terminal 2/3 of the CDS and all of the predicted exons are correct based on alignment data.  A full length copy of hatching enzyme gene is adjacent on scaffold581, GLEAN3_00566.  Where these sequences overlap, they are >90% identical at the amino acid level.
###Gene_Info_Comments GLEAN3_00343 ###
Allele of GLEAN3_07948
###Gene_Info_Comments Sp-astacin protease ###
Alignment with best blast sequence, a C. elegans hatching enzyme, suggests that the model may be incomplete at both N- and C- termini.
###Gene_Info_Comments GLEAN3_28132 ###
GLEAN3_28132 is located near the edge of the contig; it appears to be missing 5' exons. GLEAN3_16698 appears to be an allele representing the 5' end of this gene, the overlapping region is nearly identical sequence. Sp-Syt15-1 has three (predicted) C2 domains.
###Gene_Info_Comments GLEAN3_16698 ###
GLEAN3_16698 is located near the edge of the contig; it appears to be missing 3' exons. GLEAN3_28132 appears to be an allele representing the 3' end of this gene, the overlapping region is nearly identical sequence. Sp-Syt15-1 has three (predicted) C2 domains.
###Gene_Info_Comments GLEAN3_22210 ###
GLEAN3_22210 appears to be a tandem duplication of the 3' end of GLEAN3_22209, located just 5' to GLEAN3_22210. The 3' exons of these two gene models are identical sequence.
Sp-Syt15b only has one C2 domain.
###Gene_Info_Comments GLEAN3_21274 ###
Merge with GLEAN3_21272 and GLEAN3_21273 predictions.
###Gene_Info_Comments GLEAN3_21272 ###
Merge with GLEAN3_21273 and GLEAN3_21274.
###Gene_Info_Comments GLEAN3_07948 ###
Allele of GLEAN3_00343
###Gene_Info_Comments GLEAN3_19127 ###
Also related to electric ray synaptotagmin C, P24507.
###Gene_Info_Comments GLEAN3_04113 ###
This gene is in a cluster with 4 other genes closely related to SpAN which are called SpAN-like.  All of these genes encode proteins closely related to tolloid.
###Gene_Info_Comments GLEAN3_18035 ###
Contains a 60 kb intron. Appears to have an allele, GLEAN3_20349, without this long intron.
###Gene_Info_Comments GLEAN3_04114 ###
This gene is one of 4 clustered SpAN-like genes; this cluster also contains SpAN.  All of these genes encode proteins closely related to tolloid.
###Gene_Info_Comments GLEAN3_20349 ###
GLEAN3_20349 appears to have an allele, GLEAN3_18035, containing a 60 kb intron.
###Gene_Info_Comments GLEAN3_05832 ###
This gene model may not be a typical Toll-like receptor. There are only four LRRs in the coding region, and the nucleotides of the intron have no similarities.

###Gene_Info_Comments GLEAN3_05850 ###
Unknown sequence (NNN) in the first intron of the current model makes this gene model incomplete. The first exon only shows typical Toll-like receptor structures (signal peptide, LRRNT, LRR(22), LRRCT, TIR(partial)).
This is a member of sea urchin-specific Tlr Group IC.
###Gene_Info_Comments GLEAN3_09970 ###
This gene model was fused to an adjacent glean model (GLEAN3_09969) to obtain a full sequence. This fused gene model may represent a pseudogene or contain a sequence error. The sequence between these two models matches coding sequence of other Sp-Tlr genes and contains stop codons (frame shift).
###Gene_Info_Comments GLEAN3_09969 ###
This gene model was fused to an adjacent glean model (GLEAN3_09970) to obtain a full sequence. This fused gene model may represent a pseudogene or contain a sequence error. The sequence between these two models matches coding sequence of other Sp-Tlr genes and contains stop codons (frame shift).
###Gene_Info_Comments GLEAN3_24207 ###
This gene model was fused to an adjacent glean model (GLEAN3_24206) to obtain a full sequence. The nucleotide sequence between 24206 and 24207 has 94% identity to another Sp-Tlr gene. This fused gene model may represent a pseudogene or contain a sequence error, but reflects the best gene structure.
This is a member of sea urchin-specific Tlr Group IC.
###Gene_Info_Comments GLEAN3_24206 ###
This gene model was fused to an adjacent glean model (GLEAN3_24207) to obtain a full sequence. The nucleotide sequence between 24206 and 24207 has 94% identity to another Sp-Tlr gene. This fused gene model may represent a pseudogene or contain a sequence error, but reflects the best gene structure.
###Gene_Info_Comments GLEAN3_07599 ###
Another predicted gene, GLEAN3_15362, has the same domain as this one; they may be different isoforms of the Glass protein.
###Gene_Info_Comments GLEAN3_19810 ###
There are many homologs of this protein.
###Gene_Info_Comments GLEAN3_04115 ###
This is one of 4 SpAN-like genes in a cluster that also contains SpAN.  All of these genes encode proteins closely related to tolloid.
###Gene_Info_Comments GLEAN3_04116 ###
This is one of four clustered SpAN-like genes; the cluster also contains SpAN.  All of these genes encode proteins closely related to tolloid.
###Gene_Info_Comments GLEAN3_04117 ###
This is one of four SpAN-like genes in a cluster; the cluster also contains SpAN.  All five genes encode proteins that are closely related to tolloid.

Based on alignment with best blast sequence, it is likely that the model lacks the N-terminal exon.  Transcriptome expression data suggests that the missing exon lies between coordinates 13000 and 13500 on scaffold 61174.
###Gene_Info_Comments GLEAN3_26629 ###
Glean3_26630 immediately downstream also has homology to fibulin, but differs from 26629.
###Gene_Info_Comments GLEAN3_06628 ###
Partial CDS. Alignment with the best BLAST hit suggests that this model is missing both N- and C-terminal exons.  The sequence inferred for the protease active site is significantly altered from other SpAN-like proteins or the closely related tolloid proteins.
###Gene_Info_Comments GLEAN3_23280 ###
Partial CDS. Alignment with the best BLAST sequence suggests that this model contains only a portion of the gene, encoding the N-terminal half of the protein.  The model is located at one end of a short scaffold (73008), making it likely that the remainder of the gene is on another scaffold.
###Gene_Info_Comments GLEAN3_00881 ###
4 Glean3 models match the mvp sequence
GLEAN3_07085
GLEAN3_00881
GLEAN3_18647
and GLEAN3_18164 partial
###Gene_Info_Comments GLEAN3_18647 ###
4 Glean3 models match the mvp sequence
GLEAN3_07085
GLEAN3_00881
GLEAN3_18647
and GLEAN3_18164 partial
###Gene_Info_Comments GLEAN3_18164 ###
4 Glean3 models match the mvp sequence
GLEAN3_07085
GLEAN3_00881
GLEAN3_18647
and GLEAN3_18164 partial
###Gene_Info_Comments Sp-AN-like7 ###
Partial CDS. Alignment with the best BLAST sequence suggests that this model encodes two CUB domains only.  The sequences of these domains are most closely related to those in SpAN, but it is not clear that they are linked to a metalloprotease domain.  Based on the position of this model on Scaffold16164, the remaining part of the gene is on another scaffold.
###Gene_Info_Comments Sp-AN-like8 ###
Partial CDS. Alignment with the best BLAST sequence suggests that this model encodes two CUB domains only.  The sequences of these domains are most closely related to those in SpAN, but it is not clear that they are linked to a metalloprotease domain.  Based on the position of this model on Scaffold37352, the remaining part of the gene is on another scaffold.
###Gene_Info_Comments GLEAN3_15333 ###
Partial Toll-like receptor. This gene is located at the end of a short scaffold.  Nucleotide seq has 94% similarity to another Sp-Tlr gene.
###Gene_Info_Comments GLEAN3_21495 ###
One of 4 genes containing multiple EGF and TB domains.  Gene appears to be truncated at the 5' end because the scaffold is incomplete.
###Gene_Info_Comments GLEAN3_01532 ###
One of two tandem genes containing EGF and TB repeats.
###Gene_Info_Comments GLEAN3_24479 ###
Partial Toll-like receptor. This gene is located at the end of a short scaffold.  Nucleotide seq has 95% similarity to another Tlr gene.
###Gene_Info_Comments GLEAN3_01533 ###
One of two fibrillin genes.  Other parts of this same gene are in GLEAN3_01533, 20166, 21495
###Gene_Info_Comments GLEAN3_12550 ###
One of 4 genes containing multiple EGF and TB domains.
###Gene_Info_Comments GLEAN3_11455 ###
The nucleotide sequences of the first exon and the following intron have 100% identity to GLEAN3_11454. This gene model may result from a recent duplication or from an incorrect prediction.
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_13876 ###
Unknown sequence (NNN...) in the first intron of the current model could make this gene model incomplete. The first exon shows typical Toll-like receptor structures (signal peptide, LRRNT, LRR, LRRCT, TIR(partial)).
###Gene_Info_Comments GLEAN3_17530 ###
The first exon of this gene model was eliminated and a part of the intron was accepted as coding region by comparison to the corresponding FgenesAB and ++ prediction.
This is a member of sea urchin-specific Tlr Group I(orphan).

###Gene_Info_Comments GLEAN3_19661 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 94% identity to another Sp-Tlr gene, so it could be a member of Toll-like receptor.
This is a member of sea urchin-specific Tlr Group IB. 
###Gene_Info_Comments GLEAN3_26630 ###
This may be a splice variant of Glean3_26629.
###Gene_Info_Comments GLEAN3_19882 ###
This gene model was fused to an adjacent glean model (GLEAN3_19881) to obtain a full sequence. This fused gene model may represent a pseudogene or contain a sequence error. The nucleotide sequence between 19881 and 19882 matches coding sequence of other Sp-Tlr genes and contains stop codons (frame shift).
###Gene_Info_Comments GLEAN3_19881 ###
This gene model was fused to an adjacent glean model (GLEAN3_19882) to obtain a full sequence. This fused gene model may represent a pseudogene or contain a sequence error. The nucleotide sequence between 19881 and 19882 matches coding sequence of other Sp-Tlr genes and contains stop codons (frame shift).
###Gene_Info_Comments GLEAN3_20258 ###
This gene model was fused to an adjacent glean model (GLEAN3_20257) to obtain a full sequence. This fused gene model may represent a pseudogene or contain a sequence error. The nucleotide sequence of these two models matches coding sequence of other Sp-Tlr genes and contains stop codons (frame shift).
This is a member of sea urchin-specific Tlr Group IB. 
###Gene_Info_Comments GLEAN3_20257 ###
This gene model was fused to an adjacent glean model (GLEAN3_20258) to obtain a full sequence. This fused gene model may represent a pseudogene or contain a sequence error. The nucleotide sequence of these two models matches coding sequence of other Sp-Tlr genes and contains stop codons (frame shift).
###Gene_Info_Comments GLEAN3_20654 ###
This gene model was fused to an adjacent glean model (GLEAN3_20653) to obtain a full sequence. This fused gene model may represent a pseudogene or contain a sequence error. The nucleotide sequence of these two models matches coding sequence of other Sp-Tlr genes and contains stop codons (frame shift).
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_20653 ###
This gene model was fused to an adjacent glean model (GLEAN3_20654) to obtain a full sequence. This fused gene model may represent a pseudogene or contain a sequence error. The nucleotide sequence of these two models matches coding sequence of other Sp-Tlr genes and contains stop codons (frame shift).
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_26338 ###
missing 5' of CDS
5' of CDS probably encoded by GLEAN3_00937 + GLEAN3_28620 or GLEAN3_12324

###Gene_Info_Comments GLEAN3_15080 ###
There are 10 ESTs.
Gene required correction from EST data
###Gene_Info_Comments GLEAN3_20849 ###
1 EST, CD342027 Has 1st exon and 5'UTR only

Gene model is missing the C-terminus

###Gene_Info_Comments GLEAN3_26767 ###
No ESTs
Model has only the nucleotide binding domain and is missing the C-terminal half of the protein.
###Gene_Info_Comments GLEAN3_12874 ###
No ESTs
Missing N-terminus.
###Gene_Info_Comments GLEAN3_24785 ###
1 EST, DN561873, contains central half of the gene.
Duplicated exon in the middle of the gene removed

Missing N-terminus, has the wrong C-terminus
###Gene_Info_Comments GLEAN3_21184 ###
Amino Acid Sequence corrected.
2 ESTs. CD311111 overlaps GLEAN_14007

###Gene_Info_Comments GLEAN3_14013 ###
No ESTs
On the same contig as GLEAN_14007 another ABCG2-like gene.
Missing the N-terminus and Walker A domain
###Gene_Info_Comments GLEAN3_15930 ###
Merged with GLEAN3_15929 and GLEAN3_15931. Refer to GLEAN3_15929 for the complete gene features of REJ4.
###Gene_Info_Comments GLEAN3_15931 ###
Merged with GLEAN3_15929 and GLEAN3_15930. Refer to GLEAN3_15929 for the complete gene features of REJ4.
###Gene_Info_Comments GLEAN3_28620 ###
missing 5' of CDS (domain I; HS attachment sites)
5' of CDS probably encoded by GLEAN3_00937
missing 3' of CDS
3' of CDS probably encoded by GLEAN3_26338
haplotype duplication of GLEAN3_12324
GLEAN3_28620 is shorter than GLEAN3_12324 and is missing a 29 residue piece in the middle of the sequence that is present in GLEAN3_12324
###Gene_Info_Comments GLEAN3_09876 ###
Remove exons belonging to another gene.
###Gene_Info_Comments GLEAN3_09718 ###
CDSs of this gene align exactly with CDSs of GLEAN3_28124 on scaffold 75874. Sequence between CDS very similar also.
###Gene_Info_Comments GLEAN3_02758 ###
Similarity to vertebrate dentin due to serine repeats.
###Gene_Info_Comments GLEAN3_26442 ###
Aligns with vertebrate dentin primarily due to serine repeats. Alignment included.
###Gene_Info_Comments GLEAN3_19064 ###
Partial prediction for eIF4G. N-terminus part of the protein is predicted by GLEAN3_24859 (gene model modified)
###Gene_Info_Comments GLEAN3_12174 ###
Aligns with vertebrate dentin due to serine rich repeat region.
###Gene_Info_Comments GLEAN3_03725 ###
Missing exon 4 and 6. 
###Gene_Info_Comments GLEAN3_28124 ###
CDSs of this gene align exactly with CDSs of GLEAN3_09718 on scaffold 31633. Sequence between CDS very similar also.
###Gene_Info_Comments GLEAN3_16409 ###
CDS of minus strand of this gene align exactly with CDS of the positive strand of Glean3_16406, on same scaffold.  Sequence between CDS as well as the repeat structure in the two regions is very similar as well. This gene could be a haplotype that was put on the wrong scaffold. The contig that it is on is attached to the very end of the scaffold right next to Glean3_16406. 
###Gene_Info_Comments GLEAN3_18861 ###
The ligand binding domain is in glean3_25239 on scaffold46634.
###Gene_Info_Comments GLEAN3_16657 ###
see also Glean3_18404 for very similar glean
###Gene_Info_Comments GLEAN3_16032 ###
Aligns with vertebrate dentin due to serine rich repeat.
###Gene_Info_Comments GLEAN3_02821 ###
Aligns with vertebrate protein phosphatase 4 regulatory subunit and dentin due to repetitive regions.
###Gene_Info_Comments GLEAN3_04861 ###
Alignment to vertebrate dentin due to serine rich repeat.
###Gene_Info_Comments GLEAN3_15771 ###
Contains Domain of Unknown Function (DUF), highly conserved in vertebrates.
Alignment to vertebrate dentin is due to serine rich repeat.
###Gene_Info_Comments GLEAN3_13358 ###
Alignment with vertebrate dentin is due to serine rich repeat in dentin.
###Gene_Info_Comments GLEAN3_19545 ###
Alignment to vertebrate dentin due to serine repeat in dentin.
Possible additional exons 5' to Glean3 prediction.
###Gene_Info_Comments GLEAN3_07317 ###
Blast data suggest that this model could be tolloid1 rather than suBMP1, as it is currently named based on cDNA sequence.  However, this conclusion is tentative because the model lacks the last three domains (EGF, CUB, CUB) characteristic of tolloid proteins.  This is undoubtedly because the scaffold is too short and these remaining exons are on another scaffold.

###Gene_Info_Comments GLEAN3_26920 ###
Tiling data suggests exons in regions of repeated units.  These areas require further examination to determine if they are truly exons.
###Gene_Info_Comments GLEAN3_19152 ###
because the sequence is so short it is possible this is actually a fragment
###Gene_Info_Comments GLEAN3_13651 ###
some similarities to a part of a peptidase_M1 domain, but sequence is very short
###Gene_Info_Comments GLEAN3_06947 ###
GLEAN3_02418 duplicate
###Gene_Info_Comments GLEAN3_11869 ###
Tiling data suggests multiple exons not found by the GLEAN3 and NCBI models, but the tiling data is too messy for me to alter the GLEAN3 model.
###Gene_Info_Comments GLEAN3_05884 ###
great prediction of the N-terminus; c-terminus is in glean3_11914 (entered here)
###Gene_Info_Comments GLEAN3_10829 ###
77.8% identity with Aedes aegypti elongation factor 2 (AAK01430)


predicted exon1, i.e. GLEAN3_10829|Scaffold51549|194|325| has been deleted since it does not seem to exist in other eukaryotic EF2 sequences
The first exon in mRNA becomes GLEAN3_10829|Scaffold51549|2642|2856|which lacks a methionine

predicted exon6 GLEAN3_10829|Scaffold51549|6024|6059| has been deleted since it was a repeated sequence of exon7
###Gene_Info_Comments GLEAN3_11914 ###
this is the C-terminus of Sp-RACK, which will be completely annotated with its N-terminal prediction (glean3_05884)
###Gene_Info_Comments GLEAN3_00923 ###
a duplicated exon of Sp-RACK, fully annotated as glean3_05884

###Gene_Info_Comments GLEAN3_06221 ###
N-terminus probably truncated due to end of contig
###Gene_Info_Comments GLEAN3_06547 ###
Single exon gene encoding a partial copine. This is a possible pseudogene.
###Gene_Info_Comments GLEAN3_00906 ###
best match of 4 good matches
###Gene_Info_Comments GLEAN3_01821 ###
one match of 4
###Gene_Info_Comments GLEAN3_07865 ###
one of 4 matches no obvious haplotypes
###Gene_Info_Comments GLEAN3_07866 ###
one of 4 matches no obvious haplotypes
###Gene_Info_Comments GLEAN3_14221 ###
one of 4 matches no obvious haplotypes
###Gene_Info_Comments GLEAN3_28566 ###
one of 4 matches no obvious haplotypes
###Gene_Info_Comments GLEAN3_26438 ###
The ABCH subfamily is found only in insects, Dictyostelium and zebrafish, to date.
There are 2 ESTs.  One has an inserted sequence corresponding to an exon I can't find in the genomic sequence.

GLEAN model is missing the 3' part of the gene.
###Gene_Info_Comments GLEAN3_22633 ###
GLEAN3_22633 and GLEAN3_22634 appear to be a single gene (agrin) split into two models.  I have added the Gene features of 22634 to 22633.   
GLEAN3_02025 contains a single NtA domain, like the N-terminus of agrin. Other GLEAN predictions contain FOLN and KAZAL repeats and may comprise the next segment (especially GLEAN3_02467 and possibly GLEAN3_24994).  A fourth gene looks like the next piece (GLEAN3_22633), and the adjacent gene (GLEAN3_22634) contains LamG repeats that look like the C-terminus. These five gene predictions may be adjacent and comprise a full agrin gene.
###Gene_Info_Comments GLEAN3_22634 ###
This appears to be the last 2 LamininG domains of Agrin.  The tandem gene, GLEAN3_22633 has been amended to include the exons originally predicted for this gene
###Gene_Info_Comments GLEAN3_23247 ###
Small scaffold
###Gene_Info_Comments GLEAN3_07946 ###
Complete 5' end sequence?
###Gene_Info_Comments GLEAN3_03882 ###
Very short scaffold
###Gene_Info_Comments GLEAN3_02955 ###
Alignment with vertebrate dentin due to serine rich repeat in dentin.
###Gene_Info_Comments Sp-PGRP5 ###
Small scaffold.
###Gene_Info_Comments GLEAN3_03669 ###
exons 107766-107856 & 108255-108332 & 132399-132485 do not have homology on a protein level to phospholipase C beta [Lytechinus pictus] but do have nucleotide level homology.  

GLEAN3_22715 is a haplotype of this gene

###Gene_Info_Comments GLEAN3_23216 ###
N-terminus of prediction is longer than that of nAChR in other organisms.
###Gene_Info_Comments GLEAN3_19709 ###
See GLEAN3_01774. 
###Gene_Info_Comments GLEAN3_13095 ###
See GLEAN3_11220. 
###Gene_Info_Comments GLEAN3_04062 ###
incorrect model: incomplete
invalid exons and missing exons
some 3' exons in GLEAN3_05456

###Gene_Info_Comments GLEAN3_03655 ###
close to Fz1,Fz2,Fz7, orthology to be precisely determined
###Gene_Info_Comments GLEAN3_22373 ###
This model was fused to another glean model (GLEAN3_25278) to create a more accurate model for Sp-Triad. The Gene ID for the new model is: Sp-Triad.
###Gene_Info_Comments GLEAN3_25278 ###
This model was fused to another glean model (GLEAN3_22373) to create a more accurate model for Sp-Triad. The Gene ID for the new model is: Sp-Triad.
###Gene_Info_Comments Sp-Triad ###
This model was created by fusing two overlapping glean models based on a manual inspection of multiple protein sequence alignments. Redundant exons were taken out from the final sequence.
###Gene_Info_Comments GLEAN3_12877 ###
Other names: KIF27 and KIF7
This gene is part of the Hedgehog signaling pathway
###Gene_Info_Comments GLEAN3_03743 ###
PRP19/PSO4 pre-mRNA processing factor 19 homolog.
###Gene_Info_Comments GLEAN3_14295 ###
part of the Hedgehog signaling pathway
this model has been modified by adding the sequences of glean3_03312 and glean3_03313 in front of its sequence.
###Gene_Info_Comments GLEAN3_26178 ###
Gene is intact!
No ESTs.  There are no introns, but the ORF seems intact.
This would be the first intronless ABC gene in a multicellular organism!
Chip data suggests this is an expressed sequence.
###Gene_Info_Comments GLEAN3_17959 ###
This is an excellent match, although alignment is missing for the first 34 N-term AA's.  Entry appears to encode a complete ORF, however.
###Gene_Info_Comments GLEAN3_16028 ###
First few exons appear irrelevant, matching to mannose receptor or extracellular proteins. A 200 aa bcl domain matches exactly with Mil2 (GLEAN3_01916).
###Gene_Info_Comments GLEAN3_03241 ###
Three ESTs. Two extend the 5' end, but the extra sequences are not on the contig (could be on another contig).
###Gene_Info_Comments GLEAN3_18342 ###
No ESTs
Gene appears complete!!
###Gene_Info_Comments GLEAN3_16850 ###
No ESTs
Appears to have the wrong N-terminus
###Gene_Info_Comments GLEAN3_07357 ###
No ESTs
Extra exons and the wrong N-terminus, missing exon in NBF
Deleted dubious C-terminus
###Gene_Info_Comments GLEAN3_24666 ###
Missing C-term and N-term
Contig misassembled
3 ESTs but they do not assemble into a single contig
###Gene_Info_Comments GLEAN3_19656 ###
It seems to be a pseudogene. It contains a large part of the homeodomain, but nothing else similar to other genes.
The homeodomain sequences are identical to those in GLEAN3_24715. No Chip expression.
###Gene_Info_Comments GLEAN3_19327 ###
Gene contains a frameshift in element 17, which may make this a pseudogene.  The scaffold stops short around 50-100 nt before the stop codon.  
###Gene_Info_Comments GLEAN3_11836 ###
This 185 gene is partially present on Scaffold1870, but the location of the 3' end is unknown.  This scaffold contains the leader, intron, and part (elements 1-17) of the open reading frame.  The sequence contains subelement 15e, which makes it a member of the 185/333-E group, although the exact pattern is unknown due to the missing sequence.  The best BLAST hit is to Sp0368, or 185/333-E4.  A frameshift in element 1 may indicate that this is a pseudogene.
###Gene_Info_Comments GLEAN3_02238 ###
1 EST with the central portion of the gene.
Model is missing an exon in the NBD and perhaps the last exon.

###Gene_Info_Comments GLEAN3_26825 ###
1 EST
Missing N-terminus and C-terminus
###Gene_Info_Comments GLEAN3_01916 ###
First few exons appear irrelevant, matching to mannose receptor or extracellular proteins. A 200 aa bcl domain matches exactly with Mil1 (glean3_16028).
###Gene_Info_Comments GLEAN3_16525 ###
the evidence for this gene assignment is a similar domain organization
###Gene_Info_Comments GLEAN3_28724 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.

The three last exons of this model do not align with corresponding Pelle/IRAKs from other animal groups. An overlapping Fgenesh++ prediction (S.P_Scaffold255.seq.N000008) does not include these three exons (which fall on a separate predicted gene), and would therefore appear as a better model. These three exons of GLEAN3_28724 do not align strongly with any other protein, and they do not code for any detectable protein domain, which argues against them representing a separate gene. In addition, they are very strongly supported by tiling array data. For lack of better evidence for either alternative, we have decided to accept this glean model in its present form. It is however left to be determined experimentally whether these exons do correspond to Sp-Pik2 or a separate gene.

A closely related model (GLEAN3_00073) clusters with Irak4 in a multiple alignment tree. On the other hand, this model is equally distant to all other irak-related molecules, both based on alignments of their kinase domains or combined kinase and death domains. Because IRAKs and pelle proteins from insects are so similar, and because there is no clear co-clustering of both sea urchin homologs with specific Irak genes, we have decided to follow the approach taken for naming the C.elegans orthologs: to name them "Pik" genes after Pelle/Irak.
###Gene_Info_Comments GLEAN3_10283 ###
Alignment with vertebrate dentin due to serine-asp rich repeat.
###Gene_Info_Comments GLEAN3_15338 ###
Alignment to vertebrate dentin due to serine rich repeat in dentin.
###Gene_Info_Comments GLEAN3_00073 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.

There seems to be duplicated exons towards the C-terminus of this model. For lack of better evidence, we cannot currently determine whether these are truly duplicated exons or due to assembly errors. If these "duplicated" exons are taken out, then the alignment with murine IRAK4 is significantly improved.

This model clusters with Irak4 in a multiple alignment tree. On the other hand, a closely related model (GLEAN3_28724) is equally distant to all other irak-related molecules, both based on alignments of their kinase domains or combined kinase and death domains. Because IRAKs and pelle proteins from insects are so similar, and because there is no clear co-clustering of both sea urchin homologs with specific Irak genes, we have decided to follow the approach taken for naming the C.elegans orthologs: to name them "Pik" genes after Pelle/Irak.
###Gene_Info_Comments GLEAN3_08499 ###
Alignment to vertebrate dentin due to serine rich repeat in dentin.
EST sequences align with Glean3 model.
###Gene_Info_Comments GLEAN3_20082 ###
This model is part of a novel gene (Sp-Jak) that results from fusing GLEAN3_22023 and GLEAN3_20082. This modification was made based on a manual inspection of sequence alignments at both the amino acid and nucleotide levels.
###Gene_Info_Comments GLEAN3_22023 ###
This model is part of a novel gene (Sp-Jak) that results from fusing GLEAN3_22023 and GLEAN3_20082. This modification was made based on a manual inspection of sequence alignments at both the amino acid and nucleotide levels.
###Gene_Info_Comments Sp-Jak ###
This model was created as a fusion of GLEAN3_22023 and GLEAN3_20082, based on a manual inspection of sequence alignments at both the amino acid and nucleotide levels, both between these models and between each model and vertebrate JAKs.
###Gene_Info_Comments GLEAN3_28494 ###
Alignment with vertebrate dentin due to serine rich repeat.
###Gene_Info_Comments GLEAN3_00825 ###
Glean3_00825 was found to be very similar to previously cloned sea urchin SM30 genes.  Comparison to a previously isolated genomic clone (Akasaka et al 1994, JBC 269: 20592-20598) indicates that glean3_00825 is probably not SM30-alpha or SM30-beta. Glean3_00825, glean3_00826, glean3_00827, and glean3_00828 encode SM30-like proteins and they are tandemly arranged on Scaffold25604.

Matched c-type lectin domain (cd00037).
###Gene_Info_Comments GLEAN3_19821 ###
Only found the C-terminal end of the protein; there is no evidence of the N-terminal end.
###Gene_Info_Comments GLEAN3_24454 ###
More of sequence on scaffolds 129415 and 21113
###Gene_Info_Comments GLEAN3_13788 ###
See Glean3_12296 for AHR-like model; see Glean3_05022 for bHLH domain of AHR or AHRR homolog.  Glean3_05022 could be the missing N-terminus of this model (Glean3_13788) or of Glean3_12296.
###Gene_Info_Comments GLEAN3_05762 ###
The C-terminal end of this sequence is also contained in GLEAN3_23715
###Gene_Info_Comments GLEAN3_17036 ###
incomplete
###Gene_Info_Comments GLEAN3_04230 ###
This glean result matches the C-terminal region of Sp-PLC-delta.  The rest of the sequence is contained on scaffold 70915.  Part of scaffold 85759 is duplicated on scaffold 106154.  This gene was cloned by Coward et al., 2003. Its accession number is NP_001008790.1.

***NOTE: only this annotation contains the complete data.
###Gene_Info_Comments GLEAN3_12103 ###
This scaffold has the N-terminal region of the PLC-delta sequence.  NOTE: the fully annotated sequence can be found on scaffold 85789.   Part of scaffold 85759 is duplicated on scaffold 106154.  This gene was cloned by Coward et al., 2003. Its accession number is NP_001008790.1.

###Gene_Info_Comments GLEAN3_23044 ###
incomplete
###Gene_Info_Comments puromycin-sensitive aminopeptidease ###
partial CDS
###Gene_Info_Comments GLEAN3_18530 ###
incorrect 5' exon prediction
###Gene_Info_Comments GLEAN3_28479 ###
The predicted N-terminal region was incorrect. The predicted C-terminal region was largely incomplete. 
###Gene_Info_Comments GLEAN3_08595 ###
sequence incomplete
###Gene_Info_Comments GLEAN3_14645 ###
GLEAN3_07315 is almost identical but shorter, duplication most likely due to assembly process. 
###Gene_Info_Comments GLEAN3_12278 ###
sequence probably incomplete
###Gene_Info_Comments GLEAN3_07315 ###
Almost identical to GLEAN3_14645 but shorter, duplication most likely due to assembly process. 
###Gene_Info_Comments GLEAN3_05654 ###
We pulled the partial cDNA from an Sp egg library.  It aligns well with six of the middle Glean predictions.  Glean3_20904 also aligns well with the partial cDNA.
###Gene_Info_Comments GLEAN3_04867 ###
Similar to SM30-alpha. Adjacent to SpSM30-like-B (glean3_04869), but on the opposite strand.

Matches c-type lectin domain (cd00037) and pericardin like repeats (PR009765).
###Gene_Info_Comments GLEAN3_08747 ###
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137538013-8866-40191825170.BLASTQ4
###Gene_Info_Comments GLEAN3_04869 ###
Similar to SM30-alpha. Adjacent to SpSM30-E (glean3_04867) but on the opposite strand.

Matches c-type lectin domain (smart00034).
###Gene_Info_Comments GLEAN3_07404 ###
Member of CYP1 family. Tentatively designated CYP1F6 pending CYP nomenclature committee approval. Phylogenetically falls at base of CYP1 family, within CYP clan 2 (MrBayes v3.0.1 WAG+gamma+I model). Percent ID (on masked basis) to   () of % (aa level). Single exon gene.
###Gene_Info_Comments GLEAN3_01365 ###
Duplicated Gene...see also GLEAN3_08207.
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137535659-19561-20778833499.BLASTQ4
###Gene_Info_Comments GLEAN3_08207 ###
Duplicated gene...see also GLEAN3_01365
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137537713-13786-83388701897.BLASTQ1
###Gene_Info_Comments GLEAN3_18810 ###
Appears most similar to SpSM32 (Illies et al.)  But it also contains the first exon of SpSM50 (glean3_18811).  SpSM32 and SpSM50 share a first exon.

Matches c-type lectin domain (cd00037).
###Gene_Info_Comments GLEAN3_07081 ###
Alignment with vertebrate dentin due to serine rich repeats.
Embryonic and larval EST sequences conform to gene prediction.

###Gene_Info_Comments GLEAN3_00154 ###
Lacking N-terminus.  See also GLEAN3_06123. 
###Gene_Info_Comments GLEAN3_02589 ###
See GLEAN3_06123, _00154 (C-terminus only). 
###Gene_Info_Comments GLEAN3_10811 ###
Alignments with known proteins, including vertebrate dentin, are due to high number of repetitive amino acids.
###Gene_Info_Comments GLEAN3_12324 ###
missing 5' of CDS (domain I; HS attachment sites)
5' of CDS probably encoded by GLEAN3_00937
missing 3' of CDS
3' of CDS probably encoded by GLEAN3_26338
haplotype duplication of GLEAN3_28620
GLEAN3_12324 is longer than GLEAN3_28620 and possesses a 29 residue piece in the middle of the sequence that is missing in GLEAN3_28620 (possible intron)
###Gene_Info_Comments GLEAN3_09723 ###
Alignments with known proteins, including vertebrate dentin, are due to repetitive amino acids.
###Gene_Info_Comments GLEAN3_04183 ###
See GLEAN3_06123, _02589, _00254. 
###Gene_Info_Comments GLEAN3_20107 ###
Missing N-terminus.  See GLEAN3_06123, _00154, _02589, _04183. 
###Gene_Info_Comments GLEAN3_17634 ###
Several high scoring hits, entry had a BLAST score of 0.00.
BLAST of ABP-120 revealed the same GLEAN3 prediction with a score of 0.00.
###Gene_Info_Comments GLEAN3_02605 ###
See GLEAN3_06123, _00154, _02589, _04183, _20107. 
###Gene_Info_Comments GLEAN3_19540 ###
Missing 230 amino acids found at the amino terminus in mouse.
###Gene_Info_Comments GLEAN3_09340 ###
Blasted with human homolog of Myosin IIIA and Myosin IIIB and obtained the same three highest scoring predictions.
###Gene_Info_Comments GLEAN3_19203 ###
Predicted N-terminus is longer than those of other organisms.  See GLEAN3_06123, _00154, _02589, _04183, _20107, _02605.  
###Gene_Info_Comments GLEAN3_22763 ###
See GLEAN3_06123, _00154, _02589, _04183, _20107, _02605, _19203. 
###Gene_Info_Comments GLEAN3_12184 ###
Missing N-terminus.  High scoring hits to AChE (GLEAN3_06123, _00154, _02589, _04183, _20107, _02605, _19203, _22763). 
###Gene_Info_Comments Sp-PLC-delta ###
Part of scaffold 85759 is duplicated on this scaffold (106154). NOTE: the annotation on scaffold 85759 contains the complete data.  This gene was cloned by Coward et al., 2003. Its accession number is NP_001008790.1.
###Gene_Info_Comments GLEAN3_07745 ###
Pfam00484.11 match.  

Transcriptome data indicate that it is expressed in embryos.

A family of carbonic anhydrase-like proteins exists in the sea urchin. Orthologous relationships with vertebrate carbonic anhydrases and related proteins have not been carefully examined.
###Gene_Info_Comments GLEAN3_14664 ###
N-terminus is missing.  See GLEAN3_14664.  
###Gene_Info_Comments GLEAN3_24978 ###
See GLEAN3_13765.
###Gene_Info_Comments GLEAN3_08910 ###
See GLEAN3_28455, _14664. 
###Gene_Info_Comments GLEAN3_18811 ###
Glean3_18811 is on the same scaffold as SM37. This is to be expected for SM50 [Lee et al. (1999) Develop. Growth Differ. 41: 303-312; PubMed: 10400392]. Glean3_18811 is most similar to the aa sequence of S. purpuratus SM50.  However, it is missing the first exon and the intron one would expect from the canonical SM50 gene (Sucov et al. and Katoh-Fukui et al.). Just 5' to this Glean model is glean3_18810, which encodes SpSM32; SpSM32 shares its first exon with SpSM50. This gene model needs to be changed to reflect this.

Matched c-type lectin domain (cd00037).
###Gene_Info_Comments GLEAN3_28783 ###
See GLEAN3_02787.  
###Gene_Info_Comments GLEAN3_28784 ###
See GLEAN3_13765, _24978.  
###Gene_Info_Comments GLEAN3_16008 ###
Two exons are missing:
exon 1, containing the signal peptide, found on scaffold 82776
exon 4 (instead of the 5 aa exon "GLFCF"), containing the transmembrane domain, found on scaffold 7868 
###Gene_Info_Comments GLEAN3_08058 ###
48% identity with corresponding region in human valyl-tRNA synthetase (VARS2)
45% identity with human VARS2-like
Sp-VARS isoformA has 47% identity with the Sp-VARSisoformB (glean3_02908) 

###Gene_Info_Comments GLEAN3_19101 ###
TWO COMMENTS:

1) Alignment with best blast hit sequence suggests that the first exon (see below) is incorrect. This conclusion is supported by transcriptome signals.
>GLEAN3_19101|Scaffold112071|10738|10828| DNA_SRC: Scaffold112071 START: 10738 STOP: 10828 STRAND: + 
CTGAAATTCTGAGATGTCGAATGGGGAGGTGCGTAGTGACAAGGGTGTGGTATAGGCCTAGCTGTAGAGG
ATGCTTCAGAAGAGCCGACAT

2) This model is adjacent to a very similar gene model,  GLEAN3_19102
###Gene_Info_Comments GLEAN3_21352 ###
This prediction is one of many ACE genes on scaffold 52540.
Alignment with best blast sequence suggests there may be a missing exon between the following exons in this model:
>GLEAN3_21352|Scaffold52540|47460|47651| DNA_SRC: Scaffold52540 START: 47460 STOP: 47651 STRAND: + 
CAAGGAGATGAGCGGGTATAGGTCCTGTGGGGTTAACCTTGTCCGTCCCGTACTTCTCTCCCAGCTTTCT
TCTCACAAAGGCATGGATCTGGAGGTACATCGGTTTGACTGCATCCCAAAGGGCGTCGATCTTCTCCACG
AAGTGTGGGTCTTCATAACGACGACGGAGTAAGTCGCCACGATCTTCATAAC
>GLEAN3_21352|Scaffold52540|49826|49971| DNA_SRC: Scaffold52540 START: 49826 STOP: 49971 STRAND: + 
GTGATTCGCTCATCAGGTGTTGAAGGCCTGGTTCCATGTTCAAGCATTGCTCCGATCGTTTCTTCCTCTG
TGCATGCTCTCTCTTCCGACAAACCTTTCCGGTAGCGAAGATAGTTGTCATGTTGTCCTGGACCTCACGT
TCTTTA

###Gene_Info_Comments GLEAN3_11641 ###
45% identity with human EF2 kinase (AAH32665)
###Gene_Info_Comments GLEAN3_21837 ###
ESTs used to confirm the model cover only the 5' portion of the gene (first 6 exons) and the 3' UTR; however, the tiling array correlates well with model predictions throughout.  Multiple splice variants likely exist, since some ESTs contain the 3rd exon in the prediction and some do not.  Length of the 3' UTR is based on tiling array data and the presence of AAUAAA and CA at the 3'-most end.  The start of the transcript indicated here differs from that in the original prediction and is based on EST data (BCM Exonerate CD304782 and CX555128) and tiling data.  The end of the CDS is based on the existence of a TAA at the site indicated.
###Gene_Info_Comments GLEAN3_26627 ###
Missing N-terminus and should be combined with GLEAN3_26626.  The prediction includes extra C-terminus.  
###Gene_Info_Comments GLEAN3_25999 ###
See GLEAN3_25315. 
###Gene_Info_Comments GLEAN3_17426 ###
Missing N-terminus.  
###Gene_Info_Comments GLEAN3_08988 ###
See GLEAN3_25315, _25999, _08988. 
###Gene_Info_Comments GLEAN3_07186 ###
Missing N-terminus.  See GLEAN3_25315, _25999, _08988.  
###Gene_Info_Comments GLEAN3_20213 ###
Missing N-terminus.  
###Gene_Info_Comments GLEAN3_23617 ###
Missing the N-terminal TM domains, which are off the contig.
There are two ESTs, one in the coding and one in the 3'UTR.
One exon in an NBD was deleted from the GLEAN model
###Gene_Info_Comments GLEAN3_05577 ###
Glean3_05577 may be partial, as it lies at the end of a contig. It is similar to a pair of genes, 09924 and 09925, that are adjacent to one another on a contig. The other of the pair (with 05577) may be 05723, which is also on a small contig and likely truncated.
###Gene_Info_Comments GLEAN3_12621 ###
See Developmental Biology 204(1) 293-304 (1998) for more information.

Unable to verify all exons experimentally, but likely correct.
Published mRNA sequence does not extend along the scaffold as far as the upstream UTR.
###Gene_Info_Comments Sp-Zic4-like ###
complete cds
###Gene_Info_Comments GLEAN3_26395 ###
This sequence represents only the N-terminus of the protein.  SP-ABCC1a has the full-length sequence of a closely related gene.
###Gene_Info_Comments GLEAN3_28797 ###
Unable to differentiate between ABCC8 and ABCC9 families.  This is orthologous to the human ABCC8/9 families.
###Gene_Info_Comments GLEAN3_05723 ###
Glean3_05723 may be partial, as it lies at the end of a contig. It is similar to a pair of genes, 09924 and 09925, that are adjacent to one another on a contig. The other of the pair (with 05723) may be 05577, which is also on a small contig and likely truncated.
###Gene_Info_Comments GLEAN3_09924 ###
Lies adjacent to a highly similar gene, 09925, on the same contig.
###Gene_Info_Comments GLEAN3_09925 ###
Adjacent to Glean3_09924, a similar gene.
###Gene_Info_Comments GLEAN3_03744 ###
Likely assembly error. Identical from AA 180 onwards to GLEAN3_03743. Recommend deletion?
###Gene_Info_Comments GLEAN3_24191 ###
The first 2000 bases at the 5' end of this gene match very closely with the 5' half of GLEAN3_20669.  The latter halves of the two genes are quite divergent from each other.  This may be a case of duplication of all or half of one of these genes.  They are distant enough, however, that they are not being labeled as duplicates.
###Gene_Info_Comments GLEAN3_02411 ###
This gene is quite similar to GLEAN3_20669.  However, there are several instances of large insertions/deletions as well as numerous amino acid differences.  Perhaps it is a relatively recent duplication.  The number of differences makes it unlikely that this is simply due to haplotype variation.
###Gene_Info_Comments GLEAN3_25903 ###
Unable to differentiate between ABCC8 and ABCC9 families.  This is orthologous to the human ABCC8/9 families.
###Gene_Info_Comments GLEAN3_08351 ###
This prediction  is incomplete. The 5' end of the protein is not predicted. Refer to the modified sequence of GLEAN3_28479 for the corrected sequence.
###Gene_Info_Comments GLEAN3_04417 ###
U5 snRNP-associated 102 kDa protein. First part of the gene on GLEAN3_24258. Latter part on GLEAN3_04417. Likely missing one exon between two parts.
###Gene_Info_Comments GLEAN3_21366 ###
U1 small nuclear ribonucleoprotein 70 kDa like. Likely missing 5' exon(s). Can't find the missing exon in protein predictions.
###Gene_Info_Comments GLEAN3_28726 ###
First 21 exons encode plexin.  The other 19 exons encode sortilin-1.  
###Gene_Info_Comments GLEAN3_27526 ###
N-terminus of this gene is GLEAN3_27525 and should be combined.   
###Gene_Info_Comments GLEAN3_27443 ###
Also, likely ortholog of plex B1.  
###Gene_Info_Comments GLEAN3_20916 ###
This gene model is possibly incomplete, since it is located at the end of the scaffold.
###Gene_Info_Comments GLEAN3_03553 ###
The CARD domain is located downstream of the NACHT. This is not seen in mammalian Nod1 and Nod2, where the CARD domain (or both CARD domains) is located upstream of the NOD.
###Gene_Info_Comments GLEAN3_03539 ###
This gene model could be incomplete since it is located at the end of a scaffold.
###Gene_Info_Comments GLEAN3_06610 ###
This gene model could be incomplete since it is located at the end of a scaffold.
###Gene_Info_Comments GLEAN3_13619 ###
The DEATH domain encoded in this protein resembles the Dr5 receptor protein DEATH domain.

###Gene_Info_Comments Sp-VC1_2 ###
This gene model was predicted by FgeneshAB and Fgenesh++ and is not in the Glean3 list.
###Gene_Info_Comments Sp-Tnfsf_like4 ###
This gene model was predicted by FgeneshAB and Fgenesh++ and is not in the Glean3 list. The expression of this gene model has been confirmed with QPCR.
###Gene_Info_Comments GLEAN3_16163 ###
This model is possibly incomplete.
###Gene_Info_Comments GLEAN3_06529 ###
This gene includes most of Glean3_06530. The missing exons were added to this model. The last exon of this model and the first exon of Glean3_06530 do not belong to the gene. The Fgenesh prediction is correct. The Glean3_06530 model was not modified to reflect these discrepancies.
###Gene_Info_Comments GLEAN3_06530 ###
This gene model is incomplete and combines with parts of Glean3_06529 to make a complete gene model. The Glean3_06529 model was modified to include the exons in this model. Please refer to this gene model for further detail.
###Gene_Info_Comments GLEAN3_24075 ###
Fgenesh prediction contains one additional exon in the 5' end, which contains the signal peptide. This exon has been added to the glean3 model.
###Gene_Info_Comments Sp-NT1 ###
Neurotrophin found by Genscan but not by Glean.
Manual predictions agree with Genscan, except that the manual prediction has one main 3' exon containing most if not all of the translated part, whereas the Genscan prediction has the gene encoded by 2 separate exons.
###Gene_Info_Comments GLEAN3_10719 ###
Member of CYP1 family. Tentatively designated CYP1F3 pending CYP nomenclature committee approval. Phylogenetically falls at base of CYP1 family, within CYP clan 2 (MrBayes v3.0.1 WAG+gamma+I model). Percent ID (on masked basis) to CYP1A (Danio rerio) of 41.7% (aa level). Single exon, possibly part of tandem duplication.
###Gene_Info_Comments GLEAN3_10720 ###
Member of CYP1 family. Tentatively designated CYP1F4 pending CYP nomenclature committee approval. Phylogenetically falls at base of CYP1 family, within CYP clan 2 (MrBayes v3.0.1 WAG+gamma+I model). Percent ID (on masked basis) to CYP1A (Danio rerio) of 41.7% (aa level). Single exon, possibly part of tandem duplication.
###Gene_Info_Comments GLEAN3_10721 ###
Member of CYP1 family. Tentatively designated CYP1F5 pending CYP nomenclature committee approval. Phylogenetically falls at base of CYP1 family, within CYP clan 2 (MrBayes v3.0.1 WAG+gamma+I model). Percent ID (on masked basis) to CYP1A (Danio rerio) of 41.7% (aa level). Single exon, possibly part of tandem duplication.
###Gene_Info_Comments GLEAN3_11692 ###
Chip expression data indicate likely expression in embryo. Ortholog of honey bee voltage-gated L-type Ca channel. Related to Ca channel gene found in coral (Stylophora pistillata) by the Allemand group in Monaco.
###Gene_Info_Comments GLEAN3_07770 ###
Chip expression data do NOT confirm embryonic expression. Closely related to glean3_11692, which is expressed in embryo. Best hit is L-type Ca channel in the snail Lymnaea stagnalis. Also related to the coral Ca channel gene described by Allemand's group.
###Gene_Info_Comments GLEAN3_19522 ###
Beginning of gene on GLEAN3_24095. End of gene is on GLEAN3_19522. Possible haplotype for GLEAN3_19522 is GLEAN3_07176
###Gene_Info_Comments GLEAN3_07176 ###
Beginning of gene on GLEAN3_24095. End of gene is on GLEAN3_19522. Possible haplotype for GLEAN3_19522 is GLEAN3_07176.
RECOMMEND DELETION
###Gene_Info_Comments GLEAN3_24095 ###
Beginning of gene on GLEAN3_24095. End of gene is on GLEAN3_19522. Possible haplotype for GLEAN3_19522 is GLEAN3_07176
###Gene_Info_Comments GLEAN3_06939 ###
This gene model was modified by comparison to the corresponding FgeneshAB prediction and other typical Sp-Tlr genes.
This gene model may represent a pseudogene or contain a sequence error. 
###Gene_Info_Comments GLEAN3_09600 ###
See GLEAN3_27526. 
###Gene_Info_Comments GLEAN3_19917 ###
Lacking N-terminus.  See GLEAN3_27526, _09600. 
###Gene_Info_Comments GLEAN3_00729 ###
Lacking N-terminus.  See GLEAN3_28726, _27443, _08472.  
###Gene_Info_Comments GLEAN3_18059 ###
See GLEAN3_12977.
###Gene_Info_Comments GLEAN3_28135 ###
polyadenylated histone H10
###Gene_Info_Comments GLEAN3_02431 ###
Possibly only a partial sequence.  Appears to only contain C-terminus.
###Gene_Info_Comments GLEAN3_08443 ###
Probably only a partial sequence.  It is missing ~200 amino acids off of the C-terminus and 400 off of the N-terminus.
###Gene_Info_Comments GLEAN3_06899 ###
This gene encodes a precursor for a vasopressin/oxytocin/vasotocin-like peptide CFISNCPKGamide, which I suggest be named Sp-echinotocin. This is the first vasopressin/oxytocin/vasotocin-like peptide to be identified in an echinoderm. It is likely that similar or identical peptides will be found in other echinoderms, which I suggest be known collectively as echinotocins.

The GLEAN model of this gene with 4 exons is wrong because the N-terminal signal peptide is encoded by a putative internal exon (exon 2 of the GLEAN model). The model that I have produced is comprised of 3 exons with:
1. exon 1 encoding a signal peptide (confirmed by SignalP3.0 analysis), the echinotocin peptide and the N-terminal part of neurophysin.
2. exon 2 encoding the main middle portion of neurophysin
3. exon 3 encoding the C-terminal region of neurophysin.
The tiling data do not show signals that correspond with these exons, indicating that this gene is not expressed in the early stages of sea urchin development used as a source of mRNA for the tiling analysis.
No EST or cDNA data are available at present to confirm the prediction.

The model is however consistent with the structure of vasotocin/vasopressin/oxytocin genes in vertebrates, which are also comprised of 3 equivalent exons.
BLAST analysis of GenBank with the predicted 165 amino-acid precursor shows a high level of sequence similarity with vasotocin precursors in fish.
###Gene_Info_Comments GLEAN3_15742 ###
divergently transcribed as a gene pair with GLEAN3_15741
Sp-late-histone-H3b
###Gene_Info_Comments GLEAN3_16523 ###
Classic zinc finger. Closest Ciona hit (ensembl:ENSCING00000008508) logged as Thyroid hormone receptor.
###Gene_Info_Comments GLEAN3_20803 ###
SMART Confidently predicted domains, repeats, motifs and features:

IG            begin: 134   end: 215 e-value 5.37e-04 
transmembrane begin: 359   end: 381 e-value - 
TyrKc         begin: 451   end: 721 e-value 3.46e-132
Belongs to class II of TK receptors, as defined by the
[DN]-[LIV]-x(3)-Y-Y-R (Prosite PDOC00212) consensus activation loop.

###Gene_Info_Comments GLEAN3_11485 ###
The huntingtin-like gene in sea urchin shows more than 50% similarity with the human huntingtin gene. This part of the gene codes for 844 amino acids, making up the N-terminal portion of the huntingtin protein from the start codon. Scaffold 54379, which contains this gene, is reverse complemented. The C-terminal part of this gene remains to be identified and might continue on another scaffold.
###Gene_Info_Comments NGFFFamide-precursor ###
This gene encodes a putative neuropeptide precursor. It was identified by BLAST analysis of the sea urchin genomic sequence data using the sea cucumber neuropeptide NGIWYamide as the query. The precursor appears to encode two copies of a NGIWYamide-like peptide, which has the sequence NGFFFamide.
The gene comprises two exons, as predicted by Gnomon; this is supported by the tiling data, which show signals corresponding exactly with the two predicted protein-coding exons.
The first exon encodes a putative N-terminal signal peptide, supported by SignalP3.0 analysis of the protein sequence.
The second exon encodes two copies of NGFFFamide in tandem, separated and bounded by dibasic cleavage sites. 
###Gene_Info_Comments GLEAN3_15064 ###
GLEAN3_15064 has the first part of the gene. GLEAN3_17197 should have the latter half. GLEAN3_15899 is likely a haplotype of GLEAN3_17197.
###Gene_Info_Comments GLEAN3_17197 ###
GLEAN3_15064 has the first part of the gene. GLEAN3_17197 should have the latter half. GLEAN3_15899 is likely a haplotype of GLEAN3_17197.
###Gene_Info_Comments GLEAN3_15899 ###
GLEAN3_15064 has the first part of the gene. GLEAN3_17197 should have the latter half. GLEAN3_15899 is likely a haplotype of GLEAN3_17197.
###Gene_Info_Comments GLEAN3_18213 ###
This gene model was modified based on Genscan and domain structures: the first exon + the following 200bp intron show a typical Toll-like receptor.  The third exon could belong to the next Glean model (18214). 
###Gene_Info_Comments GLEAN3_24258 ###
U5 snRNP-associated 102 kDa protein. First part of the gene on GLEAN3_24258. Latter part on GLEAN3_04417. Likely missing one exon between two parts.
###Gene_Info_Comments GLEAN3_18214 ###
This gene model was modified based on BLASTN search and domain structures: the third exon of GLEAN3_18213 + the upstream intron + this gene model + the gap between them show a typical Toll-like receptor.  This gene model may represent a pseudogene or contain a sequence error.
###Gene_Info_Comments GLEAN3_14430 ###
GLEAN3_20121 has the first part of the gene. GLEAN3_14430 has the rest of the gene.
###Gene_Info_Comments GLEAN3_16594 ###
U5 snRNP-specific protein, 116 kD. Haplotype of GLEAN3_14430
###Gene_Info_Comments GLEAN3_01709 ###
one nt sequencing error (or else it is a very new pseudogene, since the 3' end is correct)
###Gene_Info_Comments GLEAN3_14099 ###
The extreme similarity to TBP elsewhere in the genome, but with many fewer nts, suggests that this is an assembly error, not a real gene.  No transcription seen in the first two exons.  In addition, we have evidence elsewhere that there is only one copy of TBP.
###Gene_Info_Comments GLEAN3_07759 ###
probably incomplete sequence at c terminus
###Gene_Info_Comments GLEAN3_19208 ###
expressed histone gene
###Gene_Info_Comments GLEAN3_21279 ###
cleavage stage histone H3

Note I found the first 126 nts of the coding region on scaffold 24661

nts  14772 to 14897
###Gene_Info_Comments GLEAN3_07165 ###
incomplete sequence truncated at 3' end
could also be ortholog of mrp8
###Gene_Info_Comments GLEAN3_03833 ###
divergently transcribed gene pair with GLEAN3_03828
Sp-late-histone-H3e
###Gene_Info_Comments GLEAN3_03916 ###
incomplete
###Gene_Info_Comments GLEAN3_03407 ###
U5 small nuclear ribonucleoprotein 200 kDa helicase (U5 snRNP-specific 200 kDa protein) (U5-200KD) (Activating signal cointegrator 1 complex subunit 3-like 1).

GLEAN3_03407 and GLEAN3_06432 are orthologs of human U5-snRNP-200kDa protein.

GLEAN3_13364 and GLEAN3_28486 belong to second class similar to activating signal cointegrator 1.

GLEAN3_00121 and GLEAN3_17541 are unknown DEAD-box proteins similar to 200kDa snRNP protein and activating signal cointegrator 1 complex subunit 3.

GLEAN3_17541 is the longest predicted protein.
###Gene_Info_Comments GLEAN3_06432 ###
U5 small nuclear ribonucleoprotein 200 kDa helicase (U5 snRNP-specific 200 kDa protein) (U5-200KD) (Activating signal cointegrator 1 complex subunit 3-like 1).

GLEAN3_03407 and GLEAN3_06432 are orthologs of human U5-snRNP-200kDa protein.

GLEAN3_13364 and GLEAN3_28486 belong to second class similar to activating signal cointegrator 1.

GLEAN3_00121 and GLEAN3_17541 are unknown DEAD-box proteins similar to 200kDa snRNP protein and activating signal cointegrator 1 complex subunit 3.

GLEAN3_17541 is the longest predicted protein.
###Gene_Info_Comments GLEAN3_13364 ###
U5 small nuclear ribonucleoprotein 200 kDa helicase (U5 snRNP-specific 200 kDa protein) (U5-200KD) (Activating signal cointegrator 1 complex subunit 3-like 1).

GLEAN3_03407 and GLEAN3_06432 are orthologs of human U5-snRNP-200kDa protein.

GLEAN3_13364 and GLEAN3_28486 belong to second class similar to human U5-snRNP-200kDa protein and activating signal cointegrator 1.

GLEAN3_00121 and GLEAN3_17541 are likely orthologs of activating signal cointegrator 1 complex subunit 3.

GLEAN3_17541 is the longest predicted protein.
###Gene_Info_Comments GLEAN3_17541 ###
U5 small nuclear ribonucleoprotein 200 kDa helicase (U5 snRNP-specific 200 kDa protein) (U5-200KD) (Activating signal cointegrator 1 complex subunit 3-like 1).

GLEAN3_03407 and GLEAN3_06432 are orthologs of human U5-snRNP-200kDa protein.

GLEAN3_13364 and GLEAN3_28486 belong to second class similar to activating signal cointegrator 1.

GLEAN3_00121 and GLEAN3_17541 are unknown DEAD-box proteins similar to 200kDa snRNP protein and activating signal cointegrator 1 complex subunit 3.

GLEAN3_17541 is the longest predicted protein.
###Gene_Info_Comments GLEAN3_27604 ###
Alignment indicates the gene may be truncated at the amino terminus; other gene models suggest additional sequence could be included at the 5' end, but there are no cDNA data.
###Gene_Info_Comments GLEAN3_00121 ###
U5 small nuclear ribonucleoprotein 200 kDa helicase (U5 snRNP-specific 200 kDa protein) (U5-200KD) (Activating signal cointegrator 1 complex subunit 3-like 1).

GLEAN3_03407 and GLEAN3_06432 are orthologs of human U5-snRNP-200kDa protein.

GLEAN3_13364 and GLEAN3_28486 belong to second class similar to activating signal cointegrator 1.

GLEAN3_00121 and GLEAN3_17541 are unknown DEAD-box proteins similar to 200kDa snRNP protein and activating signal cointegrator 1 complex subunit 3.

GLEAN3_17541 is the longest predicted protein.
###Gene_Info_Comments GLEAN3_28486 ###
U5 small nuclear ribonucleoprotein 200 kDa helicase (U5 snRNP-specific 200 kDa protein) (U5-200KD) (Activating signal cointegrator 1 complex subunit 3-like 1).

GLEAN3_03407 and GLEAN3_06432 are orthologs of human U5-snRNP-200kDa protein.

GLEAN3_13364 and GLEAN3_28486 belong to second class similar to activating signal cointegrator 1.

GLEAN3_00121 and GLEAN3_17541 are unknown DEAD-box proteins similar to 200kDa snRNP protein and activating signal cointegrator 1 complex subunit 3.

GLEAN3_17541 is the longest predicted protein.
###Gene_Info_Comments Sp-Tlr223 ###
Partial Toll-like receptor predicted by FgeneshAB and Genscan. The nucleotides of this gene model have 92% identity to a typical Sp-Tlr gene (GLEAN3_12257). The model is located at the end of a short scaffold.

###Gene_Info_Comments Sp-Tlr224 ###
Partial Toll-like receptor. A part of this modified gene model is predicted by Genscan and NCBI. The nucleotides have 96% similarity to a typical Sp-Tlr gene (GLEAN3_10940). The model occupies all sequence of a short scaffold.

###Gene_Info_Comments Sp-Tlr225 ###
Partial Toll-like receptor predicted by FgeneshAB. The modified gene model occupies all sequence of a short scaffold and the nucleotides have 92% identity to a typical Sp-Tlr (GLEAN3_00615). 

###Gene_Info_Comments Sp-Tlr226 ###
Partial Toll-like receptor predicted by FgeneshAB and Genscan. The coding region of this gene model occupies all sequence of a short scaffold and the nucleotides have 89% identity to a typical Sp-Tlr (GLEAN3_23035). 

###Gene_Info_Comments Sp-Tlr227 ###
Partial Toll-like receptor predicted by Fgenesh++. The nucleotides of the coding region have 92% identity to a typical Sp-Tlr (GLEAN3_15066). This model is located at the end of a short scaffold. 
###Gene_Info_Comments GLEAN3_00009 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_15139 ###
paralog of Sp-tachykinin-receptor-2 (GLEAN3_15140); is located on scaffold 49502 adjacent to Sp-tachykinin-receptor-2 (GLEAN3_15140), indicating that these two genes arose by recent gene duplication event
###Gene_Info_Comments GLEAN3_05990 ###
Matches c-type lectin domain (smart00034.10)
###Gene_Info_Comments GLEAN3_15140 ###
paralog of Sp-tachykinin-receptor-1 (GLEAN3_15139); is located on scaffold 49502 adjacent to Sp-tachykinin-receptor-1 (GLEAN3_15139), indicating that these two genes arose by recent gene duplication event
###Gene_Info_Comments GLEAN3_00025 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_00111 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_00116 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_00187 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_00288 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_00383 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_00447 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_00455 ###
 extra stretch of amino acids in middle
###Gene_Info_Comments GLEAN3_00590 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_00597 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_00607 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_00612 ###
 partial, missing C-terminus half
###Gene_Info_Comments GLEAN3_00771 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_00772 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_00774 ###
 missing N- and C-terminus
###Gene_Info_Comments GLEAN3_00781 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_00799 ###
 missing some C-terminus residues
###Gene_Info_Comments GLEAN3_00830 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_00835 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_01068 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_01141 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_01142 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_01201 ###
 partial, missing some N-terminus residues
###Gene_Info_Comments GLEAN3_01214 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_01525 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_01849 ###
 partial, missing N-terminus half
###Gene_Info_Comments GLEAN3_01860 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_01864 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_01934 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_02030 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_02051 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_02142 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_02201 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_02209 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_02338 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_02429 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_02491 ###
 partial, missing some N-terminus residues
###Gene_Info_Comments GLEAN3_02546 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_02650 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_02686 ###
 extra residues on N-terminus
###Gene_Info_Comments GLEAN3_02736 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_02773 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_02881 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_02901 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_02903 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_02909 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_02911 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_02963 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_02998 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_03021 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_03138 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_03325 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_03329 ###
 partial, missing N- and C-terminus residues
###Gene_Info_Comments GLEAN3_03397 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_03415 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_03444 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_03457 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_03561 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_03632 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_03667 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_03712 ###
 partial, missing N-terminus residues
###Gene_Info_Comments GLEAN3_03714 ###
 missing N- and C-terminus
###Gene_Info_Comments GLEAN3_03785 ###
 missing N- and C-terminus
###Gene_Info_Comments GLEAN3_03839 ###
 extra N-terminus, missing C-terminus
###Gene_Info_Comments GLEAN3_03913 ###
 missing two stretches in middle
###Gene_Info_Comments GLEAN3_03922 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_03982 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_04022 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_04075 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_04123 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_04216 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_04271 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_04304 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_04310 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_04326 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_04376 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_04416 ###
 partial, missing N-terminus half
###Gene_Info_Comments GLEAN3_04593 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_04673 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_04694 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_04720 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_04769 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_04771 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_04795 ###
 extra stretches in middle
###Gene_Info_Comments GLEAN3_04854 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_04877 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_04892 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_04946 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_04965 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_05056 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_05177 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_05203 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_05246 ###
 missing N- and C-terminus residues
###Gene_Info_Comments GLEAN3_05253 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_05261 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_05274 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_05285 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_05331 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_05356 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_05407 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_05423 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_05523 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_05534 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_05665 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_05675 ###
 partial, missing stretch in middle
###Gene_Info_Comments GLEAN3_05757 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_05811 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_05848 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_05852 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_05863 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_05952 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_05977 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_06017 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_06102 ###
 missing N-terminus, extra stretch in middle
###Gene_Info_Comments GLEAN3_06210 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_06217 ###
 missing C-terminus residues
###Gene_Info_Comments GLEAN3_06291 ###
 missing N- and C-terminus
###Gene_Info_Comments GLEAN3_06301 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_06339 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_06361 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_06381 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_06383 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_06414 ###
 partial, missing N-terminus and a stretch in middle
###Gene_Info_Comments GLEAN3_06426 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_06437 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_06509 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_06565 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_06585 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_06603 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_06630 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_06663 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_06693 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_06790 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_06852 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_06888 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_06890 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_06973 ###
 partial, missing N-terminus, stretches in middle
###Gene_Info_Comments GLEAN3_06985 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_07041 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_07114 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_07122 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_07150 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_07155 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_07228 ###
 partial, missing stretches in middle
###Gene_Info_Comments GLEAN3_07245 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_07246 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_07249 ###
 partial, missing C-terminus, extra N-terminus
###Gene_Info_Comments GLEAN3_07284 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_07499 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_07511 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_07583 ###
 partial, extra N-terminus, missing C-terminus
###Gene_Info_Comments GLEAN3_07672 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_07687 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_07756 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_07774 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_07833 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_07905 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_07906 ###
 partial, missing N-terminus and stretch in middle
###Gene_Info_Comments GLEAN3_07909 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_07920 ###
 missing N- and C-terminus
###Gene_Info_Comments GLEAN3_07927 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_08009 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_08080 ###
 extra N-terminus, missing C-terminus
###Gene_Info_Comments GLEAN3_08094 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_08135 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_08142 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_08176 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_08219 ###
 missing N-terminus and stretch in middle
###Gene_Info_Comments GLEAN3_08265 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_08318 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_08349 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_08403 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_08458 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_08464 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_08406 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_08542 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_08553 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_08554 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_08585 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_08616 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_08646 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_08701 ###
 missing C-terminus, missing stretch in middle
###Gene_Info_Comments GLEAN3_08705 ###
 missing N- and C-terminus
###Gene_Info_Comments GLEAN3_08815 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_08875 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_08879 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_08970 ###
 missing N- and C-terminus, missing small stretch in middle
###Gene_Info_Comments GLEAN3_09066 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_09082 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_09099 ###
 missing stretch in middle, missing C-terminus
###Gene_Info_Comments GLEAN3_09100 ###
 missing N- and C-terminus
###Gene_Info_Comments GLEAN3_09114 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_09131 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_09249 ###
 missing N-terminus, missing stretch in middle
###Gene_Info_Comments GLEAN3_09359 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_09372 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_09392 ###
 missing N- and C-terminus, extra stretch in middle
###Gene_Info_Comments GLEAN3_09437 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_09462 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_09491 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_09564 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_09617 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_09644 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_09768 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_09773 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_09853 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_09929 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_09998 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_10108 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_10205 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_10209 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_10247 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_10311 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_10500 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_10557 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_10618 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_10627 ###
 extra stretches in middle
###Gene_Info_Comments GLEAN3_10742 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_10804 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_11041 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_11234 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_11283 ###
 missing short stretch in middle
###Gene_Info_Comments GLEAN3_11329 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_11566 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_11589 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_11596 ###
 partial
###Gene_Info_Comments GLEAN3_11713 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_11928 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_11980 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_12279 ###
 missing some N-terminus residues
###Gene_Info_Comments GLEAN3_12344 ###
 missing some N-terminus residues
###Gene_Info_Comments GLEAN3_12348 ###
 missing some N-terminus residues
###Gene_Info_Comments GLEAN3_12369 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_12390 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_12615 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_12642 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_12644 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_12659 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_12695 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_12698 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_12790 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_12817 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_13130 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_13232 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_13283 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_13353 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_13586 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_13757 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_13853 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_13906 ###
 missing stretches in middle
###Gene_Info_Comments GLEAN3_13936 ###
 missing stretch, extra stretch in middle
###Gene_Info_Comments GLEAN3_13966 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_14007 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_14019 ###
 partial, missing N-terminus, missing stretch in middle
###Gene_Info_Comments GLEAN3_14062 ###
 missing N-terminus, extra stretch in middle
###Gene_Info_Comments GLEAN3_14176 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_14341 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_14342 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_14372 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_14460 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_14504 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_14595 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_14653 ###
 missing central stretch
###Gene_Info_Comments GLEAN3_14694 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_14710 ###
 partial, missing N-terminus and a stretch in middle
###Gene_Info_Comments GLEAN3_14718 ###
 missing N-terminus and central stretch
###Gene_Info_Comments GLEAN3_14939 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_15089 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_15155 ###
 partial, missing C-terminus and stretch in middle
###Gene_Info_Comments GLEAN3_15257 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_15283 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_15310 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_15323 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_15348 ###
 missing N-terminus and stretch in middle
###Gene_Info_Comments GLEAN3_15372 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_15485 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_15486 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_15507 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_15737 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_15738 ###
 extra N-terminus residues
###Gene_Info_Comments GLEAN3_15770 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_15842 ###
 missing N- and C-terminus
###Gene_Info_Comments GLEAN3_15851 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_15894 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_15966 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_15988 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_16076 ###
 missing C-terminus; extra N-terminus residues
###Gene_Info_Comments GLEAN3_16082 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_16205 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_16277 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_16344 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_16345 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_16350 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_16377 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_16383 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_16494 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_16520 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_16651 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_16702 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_16825 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_16826 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_16831 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_16860 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_16882 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_17058 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_17086 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_17099 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_17257 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_17403 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_17417 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_17478 ###
 missing N- and C-terminus; very similar to DAG kinase zeta form
###Gene_Info_Comments GLEAN3_17538 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_17573 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_17585 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_17661 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_17759 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_17992 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_18109 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_18110 ###
 missing N- and C-terminus
###Gene_Info_Comments GLEAN3_18127 ###
 extra N-terminus, missing C-terminus
###Gene_Info_Comments GLEAN3_18177 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_18270 ###
 missing N-terminus, missing stretch in middle
###Gene_Info_Comments GLEAN3_18322 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_18421 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_18431 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_18433 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_18446 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_18466 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_18527 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_18618 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_18620 ###
 missing N- and C-terminus
###Gene_Info_Comments GLEAN3_18748 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_18758 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_18907 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_19001 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_19016 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_19288 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_19357 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_19398 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_19420 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_19421 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_19428 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_19468 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_19471 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_19505 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_19692 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_19744 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_19914 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_22002 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_22028 ###
 missing N- and C-terminus
###Gene_Info_Comments GLEAN3_22045 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_22129 ###
 missing some N-terminus residues
###Gene_Info_Comments GLEAN3_22252 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_22260 ###
 missing short stretch in middle
###Gene_Info_Comments GLEAN3_22263 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_22326 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_22378 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_22397 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_22399 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_22597 ###
 missing stretches in middle
###Gene_Info_Comments GLEAN3_22651 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_22750 ###
 missing C-terminus, extra stretch in middle
###Gene_Info_Comments GLEAN3_22807 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_22949 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_22986 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_23118 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_23270 ###
 missing N-terminus, missing stretch in middle
###Gene_Info_Comments GLEAN3_23332 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_23511 ###
 missing stretch
###Gene_Info_Comments GLEAN3_23691 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_23764 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_23829 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_23834 ###
 missing stretches
###Gene_Info_Comments GLEAN3_23842 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_23859 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_23890 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_23942 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_20026 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_20089 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_20200 ###
 missing some N-terminus residues
###Gene_Info_Comments GLEAN3_20211 ###
 missing some C-terminus residues
###Gene_Info_Comments GLEAN3_20302 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_20368 ###
 missing N- and C-terminus
###Gene_Info_Comments GLEAN3_20397 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_20402 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_20445 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_20497 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_20566 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_20576 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_20707 ###
 missing N- and C-terminus
###Gene_Info_Comments GLEAN3_20739 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_20808 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_20881 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_20886 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_20970 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_21058 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_21355 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_21387 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_21465 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_21628 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_21658 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_21772 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_21788 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_21802 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_21854 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_21867 ###
 missing N- and C-terminus
###Gene_Info_Comments GLEAN3_21895 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_21933 ###
 missing N- and C-terminus
###Gene_Info_Comments GLEAN3_21934 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_21979 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_24016 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_24174 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_24224 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_24261 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_24267 ###
 missing C-terminus, extra stretch in middle
###Gene_Info_Comments GLEAN3_24383 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_24483 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_24535 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_24622 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_24755 ###
 extra N-terminus; missing C-terminus
###Gene_Info_Comments GLEAN3_24775 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_24846 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_24873 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_24970 ###
 missing N-terminus, missing stretch in middle
###Gene_Info_Comments GLEAN3_25092 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_25100 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_25411 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_25567 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_25661 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_25697 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_25702 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_25770 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_25858 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_25989 ###
 missing N-terminus, missing short stretch in middle
###Gene_Info_Comments GLEAN3_26127 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_26212 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_26291 ###
 missing some C-terminus residues
###Gene_Info_Comments GLEAN3_26437 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_26552 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_26556 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_26625 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_26639 ###
 missing some N-terminus residues
###Gene_Info_Comments GLEAN3_26702 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_26737 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_26881 ###
 extra N-terminus residues
###Gene_Info_Comments GLEAN3_27010 ###
 missing stretches in middle
###Gene_Info_Comments GLEAN3_27078 ###
 missing some N-terminus residues
###Gene_Info_Comments GLEAN3_27209 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_27304 ###
 missing some N-terminus residues
###Gene_Info_Comments GLEAN3_27344 ###
 missing some N-terminus residues
###Gene_Info_Comments GLEAN3_27388 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_27650 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_27669 ###
 missing some N-terminus residues
###Gene_Info_Comments GLEAN3_28001 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_28105 ###
 missing stretch
###Gene_Info_Comments GLEAN3_28139 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_28141 ###
 missing some N-terminus residues
###Gene_Info_Comments GLEAN3_28167 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_28178 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_28238 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_28504 ###
 missing N-terminus, missing stretch in middle
###Gene_Info_Comments GLEAN3_28572 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_28573 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_28728 ###
 extra N-terminus residues
###Gene_Info_Comments GLEAN3_03616 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_03731 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_04870 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_08806 ###
 missing N-terminus half
###Gene_Info_Comments GLEAN3_10539 ###
 missing N-terminus half
###Gene_Info_Comments GLEAN3_13979 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_15618 ###
 missing most of the C-terminus
###Gene_Info_Comments GLEAN3_18531 ###
 missing N-terminus half
###Gene_Info_Comments GLEAN3_20972 ###
 missing some N-terminus residues
###Gene_Info_Comments GLEAN3_23656 ###
 missing N-terminus half
###Gene_Info_Comments GLEAN3_24406 ###
 missing N-terminus half
###Gene_Info_Comments GLEAN3_24714 ###
 missing some N-terminus residues
###Gene_Info_Comments GLEAN3_24790 ###
 missing N-terminus half
###Gene_Info_Comments GLEAN3_26832 ###
 missing most of the C-terminus
###Gene_Info_Comments GLEAN3_27484 ###
 missing N-terminus half
###Gene_Info_Comments GLEAN3_27859 ###
 missing N-terminus half
###Gene_Info_Comments GLEAN3_00068 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_00334 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_00967 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_01285 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_01563 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_01956 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_02346 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_02465 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_02709 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_03103 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_03394 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_03812 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_04751 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_04937 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_04990 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_05195 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_05196 ###
 missing stretches in middle
###Gene_Info_Comments GLEAN3_05590 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_06347 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_06438 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_06617 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_09308 ###
 partial, missing N-terminus and stretches in middle
###Gene_Info_Comments GLEAN3_11046 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_11345 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_12046 ###
 missing some N-terminus residues
###Gene_Info_Comments GLEAN3_12806 ###
 extra stretch in middle, missing another stretch
###Gene_Info_Comments GLEAN3_13454 ###
 missing stretches in middle
###Gene_Info_Comments GLEAN3_22704 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_27953 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_11059 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_11934 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_12498 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_12676 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_12814 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_13310 ###
 missing stretch in middle, missing C-terminus
###Gene_Info_Comments GLEAN3_13361 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_13539 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_13587 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_13735 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_14024 ###
 missing C-terminus/central stretch
###Gene_Info_Comments GLEAN3_14152 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_14214 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_14338 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_14519 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_14596 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_14615 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_14937 ###
 missing N-terminus, extra stretch in middle
###Gene_Info_Comments GLEAN3_15162 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_15314 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_15853 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_15946 ###
 missing C-terminus residues
###Gene_Info_Comments GLEAN3_16164 ###
 missing C-terminus and stretch in middle
###Gene_Info_Comments GLEAN3_16679 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_16853 ###
 missing N-terminus, missing stretch in middle
###Gene_Info_Comments GLEAN3_16947 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_17326 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_17681 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_06711 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_07009 ###
 missing C-terminus, extra stretch in middle
###Gene_Info_Comments GLEAN3_07217 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_07403 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_08057 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_08192 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_08230 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_08397 ###
 missing N- and C-terminus
###Gene_Info_Comments GLEAN3_08467 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_08631 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_08633 ###
 missing N-terminus, extra C-terminus
###Gene_Info_Comments GLEAN3_08869 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_08934 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_08937 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_09014 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_09204 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_09336 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_09416 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_09458 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_09494 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_09871 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_14156 ###
 missing N- and C-terminus, extra stretches in middle
###Gene_Info_Comments GLEAN3_14280 ###
 partial, missing N-terminus half
###Gene_Info_Comments GLEAN3_14789 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_15067 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_16878 ###
 missing some N-terminus residues
###Gene_Info_Comments GLEAN3_16897 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_16916 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_17350 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_17736 ###
 extra N-terminus residues
###Gene_Info_Comments GLEAN3_17988 ###
 partial, missing N-terminus half
###Gene_Info_Comments GLEAN3_18396 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_18653 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_19095 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_19683 ###
 missing stretches in middle
###Gene_Info_Comments GLEAN3_20426 ###
 partial, missing N-terminus half
###Gene_Info_Comments GLEAN3_20467 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_20639 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_20860 ###
 missing N- and C-terminus
###Gene_Info_Comments GLEAN3_21110 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_21455 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_21878 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_23232 ###
 extra stretch in middle, missing C-terminus
###Gene_Info_Comments GLEAN3_23320 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_23426 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_23457 ###
 partial, missing N-terminus half
###Gene_Info_Comments GLEAN3_23702 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_23760 ###
 partial, missing N-terminus half
###Gene_Info_Comments GLEAN3_23846 ###
 partial, missing C-terminus half
###Gene_Info_Comments GLEAN3_25439 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_25470 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_25929 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_25934 ###
 partial, missing N-terminus half
###Gene_Info_Comments GLEAN3_26466 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_26857 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_27797 ###
 missing central stretch
###Gene_Info_Comments GLEAN3_27850 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_28290 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_28310 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_28802 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_03330 ###
 partial, missing middle stretch and C-terminus
###Gene_Info_Comments GLEAN3_03559 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_03859 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_04907 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_05150 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_06125 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_06705 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_06845 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_07043 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_18060 ###
 extra stretch on C-terminus
###Gene_Info_Comments GLEAN3_19510 ###
 extra N-terminus half
###Gene_Info_Comments GLEAN3_20937 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_21169 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_24001 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_24891 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_25779 ###
 missing N-terminus, extra C-terminus
###Gene_Info_Comments GLEAN3_26067 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_26271 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_26777 ###
 extra N-terminus half
###Gene_Info_Comments GLEAN3_26817 ###
 missing N-terminus half
###Gene_Info_Comments GLEAN3_26904 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_26913 ###
 missing C-terminus half
###Gene_Info_Comments GLEAN3_27400 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_27510 ###
 missing N-terminus half, extra C-terminus
###Gene_Info_Comments GLEAN3_27914 ###
 missing N-terminus half
###Gene_Info_Comments GLEAN3_28042 ###
 missing C-terminus half
###Gene_Info_Comments GLEAN3_28587 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_28628 ###
 missing C-terminus, extra N-terminus
###Gene_Info_Comments GLEAN3_28876 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_00862 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_01188 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_01363 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_01561 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_02713 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_03022 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_03273 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_03429 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_04380 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_04666 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_05157 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_06877 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_11374 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_12901 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_15291 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_15488 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_16048 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_17522 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_19648 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_19864 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_20156 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_20751 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_21188 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_21571 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_21685 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_23801 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_24740 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_24922 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_27274 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_27775 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_27849 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_28164 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_28496 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_10026 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_10208 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_10211 ###
 extra stretches
###Gene_Info_Comments GLEAN3_10270 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_10310 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_10479 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_10549 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_10644 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_10645 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_10646 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_10683 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_10686 ###
 extra N- and C-terminus
###Gene_Info_Comments GLEAN3_10778 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_10785 ###
 extra stretches in middle
###Gene_Info_Comments GLEAN3_10918 ###
 missing central regions
###Gene_Info_Comments GLEAN3_11036 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_11133 ###
 extra residues on N-terminus
###Gene_Info_Comments GLEAN3_11187 ###
 partial, missing stretch in middle, missing C-terminus
###Gene_Info_Comments GLEAN3_11213 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_11438 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_11449 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_11499 ###
 extra at N-terminus; extra stretch in middle
###Gene_Info_Comments GLEAN3_11604 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_11845 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_11893 ###
 missing C-terminus residues
###Gene_Info_Comments GLEAN3_11978 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_12289 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_12430 ###
 extra residues on C- and N-terminus
###Gene_Info_Comments GLEAN3_12473 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_12675 ###
 missing N-terminus; extra C-terminus
###Gene_Info_Comments GLEAN3_12789 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_12834 ###
 extra N-terminus residues
###Gene_Info_Comments GLEAN3_13010 ###
 missing stretches in between
###Gene_Info_Comments GLEAN3_13037 ###
 missing some N-terminus residues
###Gene_Info_Comments GLEAN3_13046 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_13105 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_13242 ###
 extra N-terminus; extra residues in center
###Gene_Info_Comments GLEAN3_13338 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_13376 ###
 missing some N-terminus residues
###Gene_Info_Comments GLEAN3_13428 ###
 extra N-terminus residues
###Gene_Info_Comments GLEAN3_13440 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_13463 ###
 partial, missing N-terminus and center stretch
###Gene_Info_Comments GLEAN3_13750 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_13912 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_13942 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_13943 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_14025 ###
 extra residues in center
###Gene_Info_Comments GLEAN3_14135 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_14136 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_14251 ###
 partial, missing stretch in middle
###Gene_Info_Comments GLEAN3_14315 ###
 partial, missing N-terminus, matches on small C-terminus part
###Gene_Info_Comments GLEAN3_14406 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_14507 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_14569 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_14874 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_15048 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_15062 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_15069 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_15156 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_15169 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_15343 ###
 missing N- and C-terminus
###Gene_Info_Comments GLEAN3_15380 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_15497 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_15568 ###
 partial, missing N-terminus and central stretch
###Gene_Info_Comments GLEAN3_15573 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_15586 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_15781 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_15793 ###
 partial, missing central stretch
###Gene_Info_Comments GLEAN3_15814 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_15837 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_15963 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_15981 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_15997 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_16152 ###
 partial, missing central region
###Gene_Info_Comments GLEAN3_16169 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_16302 ###
 missing N-terminus, extra C-terminus
###Gene_Info_Comments GLEAN3_16324 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_16347 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_16408 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_16410 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_16456 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_16569 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_16627 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_16652 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_16713 ###
 missing C-terminus and stretch in middle
###Gene_Info_Comments GLEAN3_16747 ###
 partial, missing N-terminus and stretch in middle
###Gene_Info_Comments GLEAN3_16824 ###
 extra N- and C-terminus
###Gene_Info_Comments GLEAN3_16988 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_16996 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_17107 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_17212 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_17281 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_17347 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_17415 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_17467 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_17481 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_17553 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_17563 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_17662 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_17695 ###
 partial, missing C-terminus and central stretch
###Gene_Info_Comments GLEAN3_17801 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_17836 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_17838 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_17965 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_17990 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_18166 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_18179 ###
 partial, missing C-terminus and stretch in middle
###Gene_Info_Comments GLEAN3_18240 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_18250 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_18251 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_18271 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_18471 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_18489 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_18522 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_18751 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_18872 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_18882 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_18894 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_18900 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_18975 ###
 extra N- and C-terminus
###Gene_Info_Comments GLEAN3_19004 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_19007 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_19074 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_19118 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_19177 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_19278 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_19297 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_19409 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_19417 ###
 partial, missing N-terminus and central stretch
###Gene_Info_Comments GLEAN3_19634 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_19640 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_19712 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_19771 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_19781 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_19811 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_19885 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_19970 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_20038 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_20060 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_20118 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_20138 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_20143 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_20155 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_20162 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_20250 ###
 missing central region
###Gene_Info_Comments GLEAN3_20362 ###
 extra N-terminus, missing stretch in middle
###Gene_Info_Comments GLEAN3_20435 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_20530 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_20673 ###
 extra N- and C-terminus
###Gene_Info_Comments GLEAN3_20679 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_20757 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_20758 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_20759 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_20773 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_20819 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_20850 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_20880 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_21035 ###
 missing stretches in middle
###Gene_Info_Comments GLEAN3_21103 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_21267 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_21559 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_21598 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_21625 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_21652 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_21775 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_21781 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_21862 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_21941 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_21997 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_22108 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_22135 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_22159 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_22186 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_22349 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_22476 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_22548 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_22549 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_22584 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_22628 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_22639 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_22796 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_22900 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_22926 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_23011 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_23059 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_23144 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_23220 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_23237 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_23311 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_23630 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_23634 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_23638 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_23689 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_23693 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_23816 ###
 partial
###Gene_Info_Comments GLEAN3_23909 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_23972 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_23979 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_24234 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_24264 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_24298 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_24312 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_24355 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_24450 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_24522 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_24629 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_24639 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_24668 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_24736 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_24774 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_24862 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_24895 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_24949 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_25014 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_25024 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_25036 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_25052 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_25123 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_25196 ###
 missing some N-terminus residues
###Gene_Info_Comments GLEAN3_25227 ###
 extra N-terminus, missing C-terminus
###Gene_Info_Comments GLEAN3_25313 ###
 missing C-terminus and stretch in middle
###Gene_Info_Comments GLEAN3_25469 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_25545 ###
 missing C-terminus and stretch in middle
###Gene_Info_Comments GLEAN3_25546 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_25709 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_25728 ###
 missing N-terminus and stretch in middle
###Gene_Info_Comments GLEAN3_25751 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_25758 ###
 missing N- and C-terminus
###Gene_Info_Comments GLEAN3_25917 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_25958 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_26160 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_26220 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_26273 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_26311 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_26351 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_26352 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_26475 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_26558 ###
 missing N-terminus and stretch in middle
###Gene_Info_Comments GLEAN3_26714 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_26718 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_26807 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_26833 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_26931 ###
 missing C-terminus and stretch in middle
###Gene_Info_Comments GLEAN3_27152 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_27176 ###
 extra N- and C-terminus
###Gene_Info_Comments GLEAN3_27179 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_27374 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_27419 ###
 missing stretch in middle, extra C-terminus
###Gene_Info_Comments GLEAN3_27728 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_27756 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_27758 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_27814 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_27852 ###
 extra N-terminus, missing stretch in middle
###Gene_Info_Comments GLEAN3_27870 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_28096 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_28097 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_28111 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_28274 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_28305 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_28451 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_28493 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_28554 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_28562 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_28586 ###
 missing N- and C-terminus
###Gene_Info_Comments GLEAN3_28763 ###
 extra N-terminus, missing C-terminus
###Gene_Info_Comments GLEAN3_28794 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_28836 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_00314 ###
 missing some N-terminus residues
###Gene_Info_Comments GLEAN3_02759 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_02977 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_03360 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_04289 ###
 extra N-terminus, missing stretch in middle
###Gene_Info_Comments GLEAN3_05521 ###
 partial, missing stretch in middle
###Gene_Info_Comments GLEAN3_05684 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_05984 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_06142 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_06330 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_06470 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_06968 ###
 partial, missing stretch in middle
###Gene_Info_Comments GLEAN3_07015 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_08652 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_08683 ###
 extra residues on N-terminus, missing stretch in middle
###Gene_Info_Comments GLEAN3_10567 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_11965 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_11983 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_14671 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_17318 ###
 missing stretches in middle
###Gene_Info_Comments GLEAN3_18102 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_19330 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_23389 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_23503 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_21699 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_21700 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_21800 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_24133 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_27188 ###
 missing stretch in middle
###Gene_Info_Comments GLEAN3_24028 ###
 missing N-terminus half
###Gene_Info_Comments GLEAN3_10213 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_14975 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_06894 ###
 missing N-terminus and stretch in middle
###Gene_Info_Comments GLEAN3_09793 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_16830 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_18973 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_21020 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_26855 ###
 missing central and C-terminus
###Gene_Info_Comments GLEAN3_28730 ###
 extra stretch in middle
###Gene_Info_Comments GLEAN3_21364 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_23364 ###
 extra N-terminus and missing C-terminus
###Gene_Info_Comments GLEAN3_23440 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_25997 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_26183 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_28101 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_28199 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_20635 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_10555 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_12120 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_13820 ###
 missing N-terminus; unrelated stretch in middle
###Gene_Info_Comments GLEAN3_14753 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_14982 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_16013 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_17727 ###
 
###Gene_Info_Comments GLEAN3_18659 ###
 extra C-terminus, missing stretch in middle
###Gene_Info_Comments GLEAN3_18893 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_19341 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_20002 ###
 missing C-terminus
###Gene_Info_Comments GLEAN3_20176 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_21295 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_24989 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_25126 ###
 extra N-terminus, missing C-terminus and stretch in middle
###Gene_Info_Comments GLEAN3_26673 ###
 extra N-terminus and stretch in middle
###Gene_Info_Comments GLEAN3_28146 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_28175 ###
 extra stretches in middle
###Gene_Info_Comments GLEAN3_28446 ###
 missing N-terminus
###Gene_Info_Comments Sp-Tlr228 ###
Partial Toll-like receptor predicted by Fgenesh, NCBI and Genscan. This gene model occupies the entire sequence of a short scaffold, and its nucleotide sequence has 96% identity to a typical Sp-Tlr (GLEAN3_27798).

###Gene_Info_Comments GLEAN3_21908 ###
Partial Toll-like receptor. The nucleotide sequence of the TIR domain has 99% identity to GLEAN3_21907. Only 200 bp of the 5' upstream sequence has high similarity to another Sp-Tlr gene. This gene may represent a recent duplication or an assembly error.
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_21907 ###
Partial Toll-like receptor. The nucleotide sequence has 94% identity to another Sp-Tlr gene, so it could be a Toll-like receptor-related gene, although no LRR is found in the upstream sequence.
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_04792 ###
Partial Toll-like receptor. The nucleotide sequence of this gene has 94% identity to a typical Sp-Tlr gene (GLEAN3_05950). The gene is located at the end of a contig, and unknown sequence (NNN) in the upstream region could make this gene model incomplete.
###Gene_Info_Comments GLEAN3_06278 ###
Possible duplicated gene: GLEAN3_09260
###Gene_Info_Comments GLEAN3_09260 ###
Possible duplicated gene: GLEAN3_06278
###Gene_Info_Comments GLEAN3_10513 ###
3' partial
GLEAN3_10488 is the 5' part of this gene
###Gene_Info_Comments GLEAN3_10488 ###
5' partial
###Gene_Info_Comments GLEAN3_20282 ###
Possible duplication: GLEAN3_22904
###Gene_Info_Comments GLEAN3_22904 ###
Possible duplication: GLEAN3_20282
###Gene_Info_Comments GLEAN3_21512 ###
Best empirically verified GenBank hit is poly(A) polymerase in Carassius auratus (goldfish), accession BAB39139, with E-value 0.0 and bit score 642.

PSSMs producing significant alignments (indicating conserved domains) include pfam04928, "PAP_central, Poly(A) polymerase central domain" and pfam04926, "PAP_RNA-bind, Poly(A) polymerase predicted RNA binding domain".

The GLEAN gene model has been modified as follows:
* The 3' UTR has been added, based on the Samanta embryonic expression data.
* The first exon has been extended 5', based on BCM:Exonerate, NCBI:Splign, and the Stolc tiling array data.
###Gene_Info_Comments GLEAN3_21760 ###
Different parts of this gene are found in different scaffolds in a non-linear organization.
###Gene_Info_Comments GLEAN3_13202 ###
Different parts of this gene are found in different scaffolds in a non-linear organization.
###Gene_Info_Comments GLEAN3_21651 ###
Different parts of this gene are found in different scaffolds in a non-linear organization.
###Gene_Info_Comments GLEAN3_05522 ###
Different parts of this gene are found in different scaffolds in a non-linear organization. 
###Gene_Info_Comments GLEAN3_07087 ###
Different parts of this gene are found in different scaffolds in a non-linear organization.
###Gene_Info_Comments GLEAN3_13967 ###
Different parts of this gene are found in different scaffolds in a non-linear organization.
###Gene_Info_Comments GLEAN3_13930 ###
Different parts of this gene are found in different scaffolds in a non-linear organization.
###Gene_Info_Comments Sp-VC1_3 ###
This gene model is located at the end of a short scaffold (Scaffold120560). The nucleotide sequence has 87% identity to another Sp-VC1 gene.
###Gene_Info_Comments GLEAN3_17609 ###
Duplicate prediction for GLEAN3_15676
###Gene_Info_Comments GLEAN3_02696 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.

There is an overlapping NCBI model that includes fewer exons but shows a similar alignment to vertebrate Map3k7. The size of this model is closer to that of its vertebrate counterpart, and thus we have decided to accept this model in its present form.
###Gene_Info_Comments GLEAN3_05254 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.
###Gene_Info_Comments GLEAN3_03955 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.

It is possible that some N-terminal sequence is missing from this model, based on alignments to vertebrate Tab2/3 and given that this model is located next to a region of various gaps between contigs.

There seems to be a duplication of this model (GLEAN3_12219). See the Gene Duplication page for further details.
###Gene_Info_Comments GLEAN3_12219 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.

It is possible that some N-terminal sequence is missing from this model, based on alignments to vertebrate Tab2/3 and given that this model is located next to a region of various gaps between contigs.

There seems to be a duplication of this model (GLEAN3_03955). See the Gene Duplication page for further details.
###Gene_Info_Comments GLEAN3_18598 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.

All other gene prediction protocols provide an identical structure for this gene, which is also supported by the genome-wide tiling array hybridization data.
###Gene_Info_Comments GLEAN3_00742 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.

This glean model is, in its present form, slightly larger than vertebrate Ube2 genes. An overlapping NCBI model (XM_791573.1) is slightly shorter; however, it does not show a better alignment to vertebrate Ube2 genes. Therefore, in the absence of additional evidence to favour either model, we have accepted the glean sequence in its present form.
###Gene_Info_Comments GLEAN3_28607 ###
first exon with cadherin-like sequences is most likely irrelevant
###Gene_Info_Comments GLEAN3_06829 ###
likely histone H2a pseudogene
###Gene_Info_Comments GLEAN3_12627 ###
likely histone H2a pseudogene
###Gene_Info_Comments GLEAN3_17950 ###
Short protein, only 99 aa.
Originally called H2bk, changed to pseudo-gene on 28 July 2006.
###Gene_Info_Comments GLEAN3_20123 ###
Sp-Elf has two splice variants differing in the 5' region:
Sp-Elf A       GLEAN3_20124
Sp-Elf B       GLEAN3_20123

###Gene_Info_Comments GLEAN3_19879 ###
Sequence xp_786867 has been predicted by automated computational analysis.
The next best match is AB051576.1 Shiwa,M., Murayama,T. and Ogawa,Y. Molecular cloning and characterization of ryanodine receptor from unfertilized sea urchin eggs
  JOURNAL   Am. J. Physiol. Regul. Integr. Comp. Physiol. 282 (3), R727-R737 (2002)

            This record is derived from an annotated genomic sequence
            (NW_791670) using gene prediction method: GNOMON, supported by EST
            evidence.

###Gene_Info_Comments GLEAN3_11042 ###
This model was annotated and modified based on a manual inspection of multiple protein sequence alignments.

Vertebrate and insect SARM proteins contain only 2 SAM domains, whereas the original version of this glean model coded for 3 SAM domains. When the nucleotide sequence of this model was inspected in detail, we noticed that there were some identical exon sequences that likely resulted from assembly problems. We have modified this model accordingly (following the NCBI model), and this modified glean model now presents a domain structure like that found in mammalian and insect SARM.
###Gene_Info_Comments GLEAN3_21673 ###
This model was annotated based on manual inspections of multiple protein sequence alignments.

This model is identical in amino acid sequence to GLEAN3_26252, and an inspection of the models strongly suggests the duplication is due to an assembly error.

Please refer to GLEAN3_26252 for further annotation details (exon structure, sequence, etc).
###Gene_Info_Comments GLEAN3_26252 ###
This model was annotated based on manual inspections of multiple protein sequence alignments.

This model is identical in amino acid sequence to GLEAN3_21673, and an inspection of the models strongly suggests the duplication is due to an assembly error.
###Gene_Info_Comments GLEAN3_10374 ###
Gene model includes 21 tandem Fibronectin Type 3 repeats
###Gene_Info_Comments GLEAN3_14498 ###
Predicted protein sequence matches the EST-derived prediction Sp-Gg1d exactly, except that an intron is found in the 3' UTR.
###Gene_Info_Comments GLEAN3_18408 ###
The gene model contains two exons that are not present in this protein as determined by cDNA sequencing.
###Gene_Info_Comments GLEAN3_05096 ###
ATP-dependent RNA helicase A (Nuclear DNA helicase II) (NDH II) (DEAH-box protein 9)
###Gene_Info_Comments GLEAN3_04517 ###
This model was modified and annotated based on a manual inspection of multiple protein sequence alignments.

We found that there was a gap in the alignment of the original version of this model with vertebrate/insect Pellino, which mapped to exon#5. The corresponding NCBI model, otherwise identical, has a slightly shorter exon#5, and shows a better alignment to other Pellino proteins. Therefore, we have decided to modify the GLEAN3 prediction accordingly.

NB: The CDS for this model does not end with a STOP codon (i.e. there might be some C-terminal sequence missing for this gene).
###Gene_Info_Comments GLEAN3_08228 ###
There is unknown sequence (NNN) in the intron of this gene model, so it is still unknown whether this model is intronless or not.

###Gene_Info_Comments GLEAN3_19314 ###
cyclin G associated kinase/DnaJ (HSP 40) homolog. Gene prediction is not complete. GLEAN3_00818 is a related/similar protein.
###Gene_Info_Comments GLEAN3_12096 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.

Note there is a slightly different FgeneshAB model for this gene; however, it does not provide a better alignment with other Ecsit proteins, and we have therefore decided to accept this model in its present form until additional evidence is obtained.
###Gene_Info_Comments GLEAN3_09399 ###
Sp-MAP2K5 spans two GLEAN predictions:
GLEAN3_09399 and GLEAN3_09398
###Gene_Info_Comments GLEAN3_00818 ###
cyclin G associated kinase/DnaJ (HSP 40) homolog. Gene prediction is not complete. GLEAN3_19314 is a related/similar protein.
###Gene_Info_Comments Sp-Gg2 ###
Gene model created on the basis of EST data.
###Gene_Info_Comments GLEAN3_23706 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.

While the best Blast hit for this model is to Transmembrane protease, serine 4 (Membrane-type serine protease 2)(MT-SP2), a careful inspection of its size and domain composition reveals that it more generally resembles members of the granzyme family and vertebrate marapsin. We therefore propose to name this and related genes Sp-Gra[nzyme]mar[apsin]-like.

The location of this model in the scaffold (away from ends or gaps), and the fact that other gene prediction protocols generated identical models strongly suggest that this is not an incomplete model.
###Gene_Info_Comments GLEAN3_01588 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.

While the best Blast hit for this model is to sea star regeneration-associated protease SRAP, a careful inspection of its size and domain composition reveals that it generally resembles members of the granzyme family and vertebrate marapsin. We therefore propose to name this and related genes Sp-Gra[nzyme]mar[apsin]-like.

The location of this model in the scaffold (away from ends or gaps), and the fact that other gene prediction protocols generated very similar models strongly suggest that this is not an incomplete model.
###Gene_Info_Comments GLEAN3_16107 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.

While the best Blast hit for this model is to sea star regeneration-associated protease SRAP, a careful inspection of its size and domain composition reveals that it more generally resembles members of the granzyme family and vertebrate marapsin. We therefore propose to name this and related genes Sp-Gra[nzyme]mar[apsin]-like.

The location of this model in the scaffold (away from ends or gaps), and the fact that other gene prediction protocols generated very similar models strongly suggest that this is not an incomplete model.
###Gene_Info_Comments GLEAN3_01225 ###
Homologue to GGDEF and Neurotrophin. Am working to find out what it is exactly.
Similar to GENSCAN predicted peptide: Supertig12270|GENSCAN_predicted_peptide_1|302_aa
CHIMERIC PROTEIN: 
Supertig12270|GENSCAN_predicted_peptide_1|302_aa CONTAINS A PIECE OF SEQUENCE WHICH IS IN Supertig153688|GENSCAN_predicted_peptide_1|85_aa and 
###Gene_Info_Comments GLEAN3_22264 ###
E1B-55kDa-associated protein 5 isoform a. GLEAN3_22264 and GLEAN3_22265 are likely incomplete/incorrect predictions for this gene.
###Gene_Info_Comments GLEAN3_22265 ###
E1B-55kDa-associated protein 5 isoform a. GLEAN3_22264 and GLEAN3_22265 are likely incomplete/incorrect predictions for this gene.
###Gene_Info_Comments GLEAN3_06871 ###
Inspection of the tiling array suggests that glean may have missed the following exons: KLYPVTLWKAHLVTKSAKMQEMGWVRKIMHGRLRNDTSGLSAGRSSLNLLLTGCCGWGGCGCCWSCCCCCCGGGPRGCWSTCCGWPIARPPPAGCGWTG,YQHVSQRSLQDKSKLSKILFSHMAKYYSKLHSLKCDIKIIIMQNIVSPGLLLILNLLQKLVYCDCALAIPSLGLKTPFLEK
###Gene_Info_Comments GLEAN3_07011 ###
Inspection of the tiling array suggests that glean may have missed the following exons: DINCVPLDLSIKRTNPQETSENEQEVGEEPLVEEPRMGEEPREEEPLVEELSMGEEPKGEESMQGGLLMREEPSEGELEGEEEGFEEQPGEYDSLDEELWVGGKPIKEEPLDEEQEREENIGLWEGALVEGPEGEEESLEEEPLVEEPEGEKEPLEEEEPEGEEEPEGEEPEGEESPEDSAAWIPV
###Gene_Info_Comments GLEAN3_07046 ###
Inspection of the tiling array suggests that glean may have missed the following exons: NGKHGEEGQREGGKENGKHREEGQREGMRRMRDMGIRDREREGGEWETWRGGTERGREENKKHREEGQREGGRRMRNMGMRDRERE
###Gene_Info_Comments GLEAN3_07360 ###
Inspection of the tiling array suggests that glean may have missed the following exons: EDDEDNDYFDPNESVVEEEESMEQSGTDGEDDDGVGDRGQVLPKEPKRADTKKGRSTSNKALICNICGLECEHGKALKQHLISHDPKSLQCSYCKRYFKRKGCLVFHLRTKHQVSIGKKWSRHEKEDLMTPSKIDEKGDDGDNDYFDPNERVVEKDESQEQSGTDGEDDEWVEDRGKELRKRRKRAVTKMSRNTSDKALICKICGLECEHSKSLKQHLISHDPNALQCSYCKWYFRRKSCLVFHLRTKHRVNVGKKWSRGNELVKNAVRAKAPKALEPKDLGDLQGGASTQLEDSSTTTILYSCKFCTKKFTKPDFLLKHEAIVHVNFRRYRCRVCRKAFSTKYALQSHSHIHVGEKRFECFICNRKFNSNSLLVRHLMHHDKPDNSDLVFAAMQPHVLSESGDPIDETVEAESAAPSTIDV,AESDPRSTHERHDKDQEQSQKLDPKEVKDQTQDKIFEQDDENKKTLPVCEICGEECKHNMALKQHLLSHDPTTYQCQYCDWFFKRKGCLIFHLRTKHKISAGRKWLRGTIDELLERDNALEDDSEEARLEEELRKQKRLQRLANNPKGPRHRCKLCGKECEHTRALKQHVMSHDPKSFQCKFCKWYFKRKGCLVFHLRTKHQVSVGKKWSRLEKEDLMTRSKTDEEENDEDNDYFDPN
###Gene_Info_Comments GLEAN3_07361 ###
Inspection of the tiling array suggests that glean may have missed the following exons: VREGIPHQKVHAEAQAKAAHDTIQTLQVQLLRQDFQRQHWARAPRAPPQGYPASCLPDLWKGIWDQVFLADAPAGPYRREEVLVPHLRSEICLEQHPYTASPAA,KKFPSEGHLKEHAAFHKEMRDVRPICEVCGLECKHNKALKQHLLSHNPHAYQCEFCKRYFRRRGCLVYHLRHIHQTFVGKKWTRGNAQEQMTRYVGPGEEDEEEEFHPELAKDMLGKHILIKSKKPRVFKCQFCPKKFIRHNLVCKHERTVHMNNGRFKCEFCTKTFMEEYNYTLHKRKHTKERPFKCTECPQSFASEKALINHQPEHRGERPFKCDECGKAFRTRKYMLKHKRRQHMTPSRLFKCSYCDKTFKDNTGRERHERRHKGIRPHVCLTCGKAFGTKYSLQTHLQVHTGEKKFSCHICDQRFALNNTLIRHLLRHDKVAASEDPALITMQEEVTVQNNSTASTGLQEVQL,GSNVQGHYQQEPAPHSQHGMPLQQIPSLPSAQQPSPPLQQGQAHTGSGDAYHIEDLSHNTSHSRLATSTSGGPMNTTPGSIGGCANTTKSNSRVIASTGRTPKKRAAPGSSPKQPSKPPSMRAPMTQIVTPQAYQQAQMLQQQQQHQPRPRECMSFSKHCQTEPLFQQVYNASMQFTKENDLPDGLEFTEDEKEKKISGVVATKDFEPGVEFGPFTGEFVKEGLGCFNPNTWEVIEQNHK
###Gene_Info_Comments GLEAN3_08358 ###
Inspection of the tiling array suggests that glean may have missed the following exons: GFGYQREEVLFQIWDKRGGEKVAFLLDFVREAPKVEVQSLQKEGTWGNYKDSSQPPPTSPLPETPSLAKDTQTPPTSPLPETPSLAKDNQTPPTSPLPETQSLAKDTQTPPTSPLPETPFLAKDTQTPPTSPLPETPSLAKDTQTPPTSPLPETQSLAKDTQTPPTSPLPETQSLAKDTQTPPTSPLPETPSLAKDTQTPPTSPLPETLSPAKDTQTPPTSPLLEIPPHHDDDGISINTNPVDGWYEYTNDTGTQTSDPEDHIGKLMDRCRISIKQNPDSKRTEACSETQSYFHTLKSRES,QNRGKMVQRHGNQTEDGERKCLEDRVWNGWKEQVVYSPFTSVASSDCENFELTELKRQLQELFERQPTKLVMTPERKIFDGKAAEIEDLVISVKRSFSRYGIREEGKRSPFSWILSEKLQR,RFNLSRRKEPGGTTKTLPNLHQLHLSQKLRPWLKTPKHHQLPLSQKLRPWLKTTKHHQLPLSQKPSPLLKTPKHHQLPLSQKPRSWLKTLKHHQLPLSQKPRPWLKTPKHHQLPLSQKLSPWLKTPKHHQLPLSQKLSPWLKTPKHHQLPLSQKPHPWLKTPKHHQLPLSQKPCPRLKTPKHHQLPLS,FMNIVYHHRTMANKRKAWSLQGIIAVMQEYLSCHDVYKRMKRNGTTKGFHQTVADKLSYQENATENLLKSLGREYRDILQ,WGSDVLRTRSITGKGSNMEKGKFDTIMSQQKNSSFVKYLAVAIWGSDVLRTRSITGKGSNVDKGKFDTIMSQQKNSSFVKYLALAIWGSDVLRTRSIT,XXXXXXXGYLGFGCSENALDHRKKGKSICEPLELGQKRRHCVVMNLPKFPSSSEPALHLSDHEAVSIVVGLAIFSGTAGSKVHLCFLRKGV
###Gene_Info_Comments GLEAN3_08377 ###
Inspection of the tiling array suggests that glean may have missed the following exons: SVSGRFGSKSCIDRVWGRLFRCIKSLLRPKASYFRFSFNSAKSISIWKCACNSCWSIKDTRLLDIFSFCWLMLVVLRKNSLWKGLSQGVYSPFSSSIKSSCCFC
###Gene_Info_Comments GLEAN3_08563 ###
Inspection of the tiling array suggests that glean may have missed the following exons: SVCNKKYRSSSDLKIHLRTHSGEKPFLCSFCGERCTDIGSLANHTIRFHREWTQQCPQCSKKSVSKSHLKTHMMVHTGEKPYQCPECSKRSATKSSLNKHMLTHNGEKRYECFECKKTYANKGDLYTHKGLIRSITLRNMSKLTPVKGLIDVL,IDIKDAVKVSTKLNGDAISSLRMQSGKQYHQCSVCGRKCPSKSDLARHLRTHTGEKPYPCPECDKRFSDKSSIPQHMLIHSGEKPYECSECSSRFNCKSNLRQHMKQHSTTKFHQCPKCDKK
###Gene_Info_Comments GLEAN3_09474 ###
Inspection of the tiling array suggests that glean may have missed the following exons: CQVACEPTGPIPYAPEHTETHEAEIIRVQQHPNISVQGVHCSEEWQRLYRERLWTQQTSTSSTSKLEAAIHSEVQRRSPGANGQRPYVYPASALIQSDASV
###Gene_Info_Comments GLEAN3_09553 ###
Inspection of the tiling array suggests that glean may have missed the following exons: IVCMGCSKMFCLEENMSQHLRRCKGLRELLKRKKNLRKISSNSDDDDDDGFIPNEEEESCGVLKALEGPDRAMGQSTMGDLNETETFGVETNRQGEDSKGRVIDMLGVRSFQDASKPFKCSYCTKRFLTKNRLLRHKHNRHPKSSVFKCDHCDQTFPYKHRLLKHLPTHNKDKRYKC,KQPLEGRSHLEPPETSWSIRELLCDKEEHFVCVTCGKHFPTNGRLKAHERFHESTCEKFECDMCGAVFKTSLSLMRHKKIHTEIQFKCTLCFKKYTCRSHLSRHMHTAHGFERVRGKILCMGCSRKFLLEDDMLKHLKSCKG
###Gene_Info_Comments GLEAN3_09642 ###
Inspection of the tiling array suggests that glean may have missed the following exons: RSLFLFLPLFFFAFSAKASTRLFWICSCSGSMMAPASISFAAYSRLIPSMDDVRLGPVLALQLWFFKYFSLAYPLLHLSQWKGKSLVWSFMCSARLGLRLNVLEQWKHLKGLIPLCVMM,CSFGTRFGVTVMVLQVLLPGISLVALVTMERKVVGVELHVFSQVRPATECLGAVEAFEGFDPTVRDDVSFELVGSVERHVAACHRVEWTLEFLIRFMDQHVSFEFVLTVELCGADLTAEWFLAGVNENVRLQIVLTLKLLVTDEAFMQGLGAVGDEMASQVPLTSKDLVAFWTVELM,LVYFYKELTPFWLSADEIFDKHVIPILITMPWKAVCESLLPRKLSTEMNIIFCYLFTGGGTKHSFNYQSTEKTIRQSLTGLKKKSFCLSNSKKGFTIREFTD
###Gene_Info_Comments GLEAN3_09685 ###
Inspection of the tiling array suggests that glean may have missed the following exons: LRPLLDAAICSLFCVFLPSNSLSLSLSLSLSLSLSLSPPPPPPPLRLYLYLSLPLLFSVFLPSNSISVSLPLPHSSLFVSSF,FVSLSSSLPLPPSSLFCVFLPPNSLSFFLSLSLTLYLFLSLLLSVYSNLLILSLSLFLSLPLLFSMSSYLLPSLSLPISSLM,MLLYVLFSVSSFLLILSRSLSLSLSLSLSLSLLLLLLLLFDSTSTSPSLFSFLSSFLLILSLSLFLSLTLLFLCLPSNLSLSLPLYLSLPLLFSVSSYLLILSPSSSPSLLLSTSSSLFSCPCIPTS,FSLLLPLPLSYSLPLPLSSLVRVFQPPNSLSLSLPLPPSSLFYVFLPPSLSLSPHIVFNVRQSIHTVIALLVVCLEMALSCRSSTVGSDCLAQGSSKTVRPRDSSSPSGLGICHRTKQ
###Gene_Info_Comments GLEAN3_09831 ###
Inspection of the tiling array suggests that glean may have missed the following exons: ANILSLNQQSATGGQQIVTNGNVQYNITPQYQIDSEGNLITTHVATPVSVSQAQAQTTQTVVRQAPTTTSQATNVAGSNVAQIALPGAGVQYIQNGQIIQAVAPQPAPTQQRIMLGSQTITLQLAPNAVMSNANSHDQPVVTPVYNIISSTPPQGNQDNSAQTQQLQIQDQLQNAQIISNSGQIQAGNAANSNQATFVQVAGKPGQIILQQPQQAGQVQQIQVSGLQTSNTNVQTTQIRQQSGKVVASVIPQQVQQVQQVQQVQQQQTTTSQIISQQSSQTVQQQPQAQVIQIHQPIQGQAGTQISLQQQPGTGYYTI,VKPKLKLPKQLSGKHPLQHPKQQTWLEAMLHKLLCLVPGYSIFRMGRSFRRLHHNQHRLSNALCSALKPSPYNLLRMQSCPMQIHMINLSLHQSTISSAPPRHRATRTILPKLSSFRSKINFRMHKSSQIVAKFKQGMPPTVTRQHLCKLPVNQDKLSCNNHNKQDRCNKYKLAGSRHRIQMFRRRRSDNKVARLWHLLSHSRYNKCNKCSKYSSSRQQHHRLFHSSHHRQFNSSHRHRSYRSINPSRGKRALKSHYSNNQVLATTQ,CKSSPSSNYPNSCQASTHYNIPSNKRGWKQCCTNCFAWCRGTVYSEWADHSGGCTTTSTDSATHYARLSNHHPTTCSECSHVQCKFT,RAPDIEYKCSDDADQTTKWQGCGICYPTAGTTSATSAASTAAADNNITDYFTAVITDSSTAATGTGHTDPSTHPGASGHSNLITATTRYWLLHN,AAKSEHQFPCDWISMFVLVKRQPSGLLLEWNKVQKHNCGGPAVSSLTDIYKNPLFSKSTDWGNRYVRALVRLCRNWSRETTLVKRLPSCKSILTQIHAA,PPPLSLSLSSPPHTPATQQQITQQQQQQISQALVGMKAEKQQQASWQGVIQTEATGGAQGGTVTTISTNGTNYPPTMASYELQIDQSQIPKQEPPKKVRRLACTCPNCKDGDGR,LILSSLISFFFVLGEKKFVCKSCGKKFMRSDHLAKHQRTHIRKPGTVSMKGQSGGGDAPLQVDLSQGVDEFDEEMEKVMRDPQHEAMVQDAVQEAVY,SVLQSIQQINGGQFMQNPIVLKAPQTQVQTVHLQHGGTPVATPQSSSSQVQQQVITTDVSPAISTANNSTIAGLPNYNVHLAPLSPGPGPAGNTSAVNINTSTAGMFNSHLLNIP,VIFRPSDQFQLAPLYVFPHLFVCLFFCRNSEKGKKQHICHIADCGKIYGKTSHLRAHLRWHTGERPFVCDWLFCGKRFTRSDELQRHRRTHTGIPTQIII
###Gene_Info_Comments GLEAN3_09832 ###
Inspection of the tiling array suggests that glean may have missed the following exons: ALLAATCSKIGTPAEGQAGAVNQGQTVTVLGQNQGQAVQIPGGFINAANAQQIQQALGLPPGFPLQFTTASAGVAGQQGTAAGGPMYIEVGPGGNIPSSSVGGATPTKSINAANILSLNQQGATGGQQIVTNGNVQYNITPQYQIDSEGNLITTHVATPVSVSQAQAQTTQTVVRQAPTTTSQATNVAGSNVTQIALPGAGVQYIQNGQIIQAVAPQPAPTQQRIMLGSQTITLQLAPNAVMSNANSHDQPVVTPVYNIISSTPPQGNQDNSAQTQQLQIQDQLQNAQIISNSGQIQAGNAANSNQATFVQVAGKPGQIILQQPQQAGQVQQIQVSGLQTSNTNVQTTQIRQQSGKVVASVIPQQVQQVQQVQQVQQQQTTTSQIISQQSSQTVQQQPQAQVIQIHQPIQGQAGTQISLQQQPGTGYYTI,VKPKLKLPKQLSGKHPLQHPKQQTWLEAMLHKLLCLVPGYSIFRMDRSFKRLHHNQHQLSNALCSALKPSPYNLLRMQSCPMQIHMINLSLHQSTTSSAPPRHRATRTILPKLSSFRSKINFRMHKLSQIVAKFKQGMQPTVTRQHLCKLPVNQDKLSCNNHNKQDRCNKYKLAGSRHLIQMFRRRRSDNKVARLWHLLSHSRYNKCNKCSKYSSSRQQHHRLFHSSHHRQFNSSHRHRSYRSINPSRGKRALKSHYSNNQVLATTQY,FRGKFDHHSCCYPCQCKSSPSSNYPNSCQASTHYNIPSNKRGWKQCYTNCFAWCRGTVYSEWTDHSSGCTTTSTNSATHYARLSNHHPTTCSECSHVQCKFT,YKCSDDADQTTKWQGCGICYPTAGTTSATSAASTAAADNNITDYFTAVITDSSTAATGTGHTDPSTHPGASGHSNLITATTRYWLLHN,ILILSSLISFFFVPGEKKFVCKSCGKKFMRSDHLAKHQRTHIRKPGTVSMKGQSGGGDAPLQVDLSQGVDEFDEEMEKVMRDPQHEAMVQDAVQEAVY,SVLQSIQQINGGQFMQNPIVLKTPQTQVQTVHLQHGGTPVATPQSSSSQVQQQVITTDVSPAISTANNSTIAGLPNYNVHLAPLSPGPGPAGNTSAVNINTSTAGMFNSHLSNIP,LPPPPLSLSSPPHTPATQQQITQQQQQQISQALVGMKAEKQQQAPWQGVIQTEATGGAQGGTVTTISTNGTNYPPTMASYELQIDQSQIKQEPPKKVRRLACTCPNC,SIPTDTPVRLPSFICMPFFCRNSEKGKKQHICHIADCGKIYGKTSHLRAHLRWHTGERPFVCDWLFCGKRFTRSDELQRHRRTHTGIPTQIII
###Gene_Info_Comments GLEAN3_10295 ###
Inspection of the tiling array suggests that glean may have missed the following exons: IYLFSIPYIHSHHQPCLPLLLFVCPISSRDNVAQVNIAALVGNALRSVVPSPLLITRIPSLRILFRMTSMLPLYLFFPSNPSA
###Gene_Info_Comments GLEAN3_10922 ###
Inspection of the tiling array suggests that glean may have missed the following exons: ISDSRVQQLDDNLSSQAERGTPLDSSESDSSTSKCGPCNYANDSGKKQSVAERETLDQGLSHHVYGQMDTERTKECDEKCEETLIEGIGQSSQFGGTEENLAKPLFTLVAGEPYISQCRIL,SFNPHEKPPSRKPYQCLVCTVTGCGLYRNRVWFVPQQGVVCTATGCGLYRNRVWFVPLQGGVCTATGCGSYRNRVWFVPQQGVVCTATGCGLYRYRVGFVPQQGGVCTFTGCGSYRNMVWFVS,FRIVEYNSLMTIYLHKRSGELLWIHPKVTVLQVSVALVTMPTILVKNKVSLKERHWTRGFHIMFTDRWIQKEQKNVTRSVKRL,LKALDNLPSLEEQRKILPSPCLPSLQESRTSASVEFCNKEFSNVLDVESHTCLMRPTREMIFKCSLCNKKFTQSTHLLRHATDARNHKGMKSLYQCSLCNQRFFYLSSLLKHVKLHSQRYPCLVCDLRFSSEKRLSMHSWSHRKDEPCECAVCKKTFPDAMSLAYHTRTHVGRNPYQYSACDLRFSDEKRPTRATRYERNRTGKNLFECHICNKIYLYERALTSHMETHTIKKLLCSSCGELFHDNFDLSLHMRSSHTGEKPYQCSVCSERFSKANSLLIHMKSHPAENPTNVWFVP,QGVVCIATGCGLYRNRVWFVPQQGVVCTATGCGLYRYRVGFVPQQGVVRTATGCGSYRNRVWFVPQQGVVCTVTGWGLYRNRVGFVPLQGVDRTATWCGLYRNRVWFVPLQGGVCTVTGWGLYRNRVGFVP,KATQQKTLPMFGLYRNRVWFVSQQGVVCTATGCGLYRNRVWFVPQQGVVCTVTGWGLYRNRVWFVPQQGVVRTATGCGLYRNRVWFVPLQGGVCTATGWGLYLYRVWIVPQHGVVCIVTGCGSYRYRVGFVP,QGGVCTATGWGLYRNRVWFVPLQGGVCTVTGWGLYRNRVWFVPQQGGVCTVTGCGLYRYRVWFVPLQGGVCTVTGCGLYRNRVWFLP,ESSEEGDTLHHRESQPGRQRHGYAIGHRTPNFTRATTAITSSHDVQDPPPPDSHPSSRLHLTERQSYKKSTSSSFHQTWYIL,NSLEKVQRRGIRFITGNHSREDSVTAMQLDIGLPTLQERRLQSRLAMMYKILHHQIAIPLPDYISQKGRATRSQHHLRFTRLGTSSDSYKNSFFPRTMKGWDELP,XXXXXXXXITRKDKDGDCAEEDGAIEMNLQLSDQVHVNDGCSKEDGGKEIHASQLGEVNYLTLTGYVKEEPLDSDSMEGVLNGANWGMIKPQPLGEEG
###Gene_Info_Comments GLEAN3_10924 ###
Inspection of the tiling array suggests that glean may have missed the following exons: VLPYKCKICDKGYCYKKGLSAHMRTHTAKRSHKCTVCNERFLNIKRHMKIHSGLIQCSICNQGFSNHGNLTQHRKIHRKQNR,YQCMYCDVRFSRVDTLSRHIRSHTGEKPYECSFCNKKTFSQTAHLTRHIKIHTGERPFECSICSKMFAERSHLTDHQKIHTGEKPYLCSVCEKRFG,ATNLLHLMAVALSSPSWHHELHLPWSHYPKVLPLHSTLRSGNSLHPVVMHESPCHRLPYYSHRLRVLDLIVVNSSLWHHLPLCNHLLCLHS,IHPVPVGNRFIHPVPADNGLIQPGPECNGFIHPVLAGNGFTQPVLAGNRFFYPVHVGNGFKYAREYFLAFVVRVVILATSMITLFSICT
###Gene_Info_Comments GLEAN3_11270 ###
Inspection of the tiling array suggests that glean may have missed the following exons: PLIPFRSSSPDRHLPVFPLAVFPWAIASALSSFPFPSCVGLRRGRLVGHDLVGCIYAIVNRCVGLTRQQNLIGRCHLMRLCVLRGDFLASPFRRRLQVLAHVRVLEPLDRAPEVTCLLPAVDKGHE
###Gene_Info_Comments GLEAN3_11272 ###
Inspection of the tiling array suggests that glean may have missed the following exons: GWSSDGTSHHTRDIGRAFHPCGLFGGTARKLSKKIVFHIPHTRVTFPLRETSVYEKQGSPSKNKFYHNQSNGYAQHPGVFSKRVHVDCPL,VTWCTKTFLCHFALEPVARDRTQDTWCTITFLRHFALEPVARDRTQDTWCTITFLRHFELEPVARDRTQDTWCTITFLRHFALEPVARDQTW,TAHCSKFFITLFAFVWPSVFTIGMYSQMVILQGTLCEEFPTAFGTMIQTFFRVKSHDVSIATPLGSELDIAQGTRKSLDTGMYCKLMSLKVTGIAKCFVAL,PPEEVREPMRLSIVDSHVYTDEALLTEGLITDGTLMFFSFLHLSFKHLSFLHLSFLHLSFKHLSFLHLSFKHLSLKHLSFKHPFFLHLSFLHPSFKHLSFIHLSLKHISVMHLSFLHLSFMQSHMFRKATICKKLFITLRTLRRLLIIVPSHVHREGTLSQEPDTTH,VTYTKEPFIALWAWKRLLDIFRTFSPCIMTIFTMFFSGSGGSCKTVDQDLTSSFCRVDCVRITTLLSQRAYSCLFLTFECEQWSVQSTDDICH
###Gene_Info_Comments GLEAN3_11280 ###
Inspection of the tiling array suggests that glean may have missed the following exons: PSPYPISDIHVHTRGSLNVITEKRSSLKTWEDLTPGHKAFRPAQLKAHHTGPKLTALSNAILIFITGWMFDLKSHYDGGGGGTFVISDFDHIVQCEIIYMFSC
###Gene_Info_Comments GLEAN3_11583 ###
Inspection of the tiling array suggests that glean may have missed the following exons: NLPYPFLISLCLSLICGFRVILFLPLETYRPTSVNNSGIGSYGLLENTTFEQIRRVMETNFFGAVRMTQEVIPIMKKQRSGRIINISSTTGIFGEWKDAFIIE
###Gene_Info_Comments GLEAN3_12083 ###
Inspection of the tiling array suggests that glean may have missed the following exons: CISVSVSAVSLSLSLSSSIFSNNMSLSVTLFSLTTSLFSGFYLLPSFTSSHSFLYNLYISLSLSDLPSLSPTTCHIHCIMKTYT,YKFKHISIDAYQSPSLLSLSLSLSLLLSSLIICLCLSLSSPSLHLSFQVFTFFHLLLPLIHSFIISISLSLSLTYPPSLLLHVTYIVL
###Gene_Info_Comments GLEAN3_12546 ###
Inspection of the tiling array suggests that glean may have missed the following exons: LVVHSHRRGRGSKQQCKTVHEFHSFEEEGVVRCELYGDVLEPAFRALGQTIYHLTSVPAIVLSFLIQYHTDISELAPVEVEAL,LSAIKASVLSLIDIANWLRSILRSWWCIVTEGAGGPNSSARLFMNSIPLKKRGLSVVNCMVMSWSLRSGLSARRFITSQAFLRLSSPF,IFLMGESPEYAGKCVVNLAADKDVIKKTGRVLLTMELAEEYGFTDVDGHRPMNYRQLKALALMGGHTWVGAMIPGFIKIPFWALAAFTHKF,LLDGREPRVRRQVRGQSRGRQGRHKEDGPCAPHDGACRRVRFHGRRRPQADELPSAEGAGPDGWTHLGWSYDPWLHQDSLLGPGRLYSQILIGHIFLFLIVKNMEEERPEISMCNSLLYVFLYNDRKKSYMLAMQYCQS
###Gene_Info_Comments GLEAN3_12632 ###
Inspection of the tiling array suggests that glean may have missed the following exons: CSLCSGGDSSLQSSYQSMRNSQEMSGDRSCNGARHLPEDFPLQDYLGTQSDSSERESCTSENNSYIIPSQVPYFHEKRAANRETPDATTSYHKPGGMDKERNEDIMMELDETINQSSQLCATGSVQNDPSCSLAKEKPFLCCVCSKGFALRISLSRHMTIHG,EDFPLQDYLGTQSDSSERESCTSENNSYIIPSQVPYFHEKRAADRETPDATTSYHKPGGIDKDRNEDIMMEFDETINQSSQLCATGSVQNDPSCSLAKEKPFLCCVCSKGFALRISLSRHMTIHGG
###Gene_Info_Comments GLEAN3_12772 ###
Inspection of the tiling array suggests that glean may have missed the following exons: YWVHARAQWLEFDVTSHHLATTGLVYIMKLVCVFVSAKTSLLNGKYTSEDKVHHKTQPVLMPIRNAALLDMLQVTKRLQLGRITKKVSIISSFVRLK
###Gene_Info_Comments GLEAN3_12911 ###
Inspection of the tiling array suggests that glean may have missed the following exons: KSPSRKRGRPRGKTSSSVKSPKIAFSPEPATSQDDPGHLGRGMRKRKKKLSLDEVNSEDVDDGDDDDGDDDHGDDDDEEDILDDEEHDNDSEVEDESTEEQISNEANKNTIKLKPLEFKPRRRGRPRKNERRTHRKRDRNWSTMDDVKPKNEERVKLNLVVPMKVFKKILRDRADEWQKTYDDEHSLNLFRCEVERCRGMGMPEAEFDVHMKCHVSNMEGFRCFICQFMCLHWRNMRHHYCKVHDQMLSKVTCDFEGCQKEFPKYGALRTHVTISHIKPDLVTKLSTSTSDLSEFSSYLDNVKIKKIGKEDHEEDDEDDGEEPRKQRGRPPKRKRPVGRPPKDTEGYSRRQNLQRRVHGEDREKFQRRLVAFVCEVCSAKFNEEEKLMEHSLRHYHNDNDQINCTECEAFVTAEESSLRIHMSEEHKRLLQLHRCDKCNFSSNRFHDLKKHNIVHTGAKNFMCDKCGKCTTTPYNLKVHYRRMHASDEEKKIKCISCEYRCADKAVLKVTFHL,LFSNFIGATNKIISEHVMCKHANVRPYHCNICGWSTAYSGNMWKHVDTHQKELGDKMPEFPVNVVSTENHSVPTPLRAPSGKKRGMNKASNFKLKLAKPGKTRRQQQAQKTQQMEQTATISILDDNVQTIQVQAGGNLPEGVLMQVSE,QPTCIDCLISFYPKVPIHMGPNGQMMVTKGLSEEESIGSSALSRLAAAVASAQEVHIIQGNEGLEGGGQHQEHRIIATQVCLAFCPFF
###Gene_Info_Comments GLEAN3_12912 ###
Inspection of the tiling array suggests that glean may have missed the following exons: SKSEIGNHFLTDHMEAYVSPIKEEGKTVSTNDSKEPDGEKEMEEKAGEEIGEDEENLAEEVEEKRPAPRPRGRPPKKRNQKPIKKQPYYYIEEVEVKDEGEETGEGEEEPKEFGRGMRRRKKAIPRHILRYELDDDEEFMEDEDYNENEERNVLPKKRVPTIPPIVMKGKRGRPRSSGSQEKGRQDEGSPKLKTPKIFKKTPTVSKLPLSDKVIERILDDRVNEWYLVFQEKHVLPIPCPFDGCALDVTQAELDVHLQCHAANLEGFRCPIDECNFLCPHWSNMRVHYRKTHEPTFYRLLCDLDGCAMVFPRIDKKSIHIHVTRKHLRPELHDELSSKDFNIEKYEKYISIVKAEEADLMKQVGEQAEDDEGEAGTSLGEHSNVVHVQQMSAVEQLETESDFDTENSRLKKRGRPRGSTKAAKLARIAAGEVFEKKGKRDKKGKGKRSFQRVMNNVSFFCDVCGGKYKSEQAVFDHKTLHYRDENCNVLRCTECTEYSTEESSELREHVALSHKSLLHLHRCDECQFSTNRYHDLKKHILVHSGSKDYMCDKCGTCTTTAYNLRVHWRRYHAPESEKNVKCFACDYMCADNGILKVLFFSFSF,VTRASCICIAVMSVNSQPIATMISRSTFSSIRAARTTCAISAAPVPPRLTISESTGVATTLLNRRRMSSASLVTTCVLIMAF,LFIFILQEHIRSKHGLMVYGKDFDNARPLPTYACSQCDYIGRKKSSLAYHMRIHTENRQFKCHICPYASKTKNNLLLHIRTHEGLQPLKCPECDFRGKNDKGLDNSFER,HLGNYVHIFYPFLQDHIKQHHKSMLKNPIYHNCPHCDYVGHKRQSLEFHMRIHMEQRRFKCHLCPYASKTKNHLKIHMQTHDGFQSASCPDCNFKGL
###Gene_Info_Comments GLEAN3_12913 ###
Inspection of the tiling array suggests that glean may have missed the following exons: QEIIKMSTSGDDTQFNCRLCNFVGGSKTEIAEHFLTEHIEQYVSLSKATPSKASKNETETVKETKEKEEEQKEVESDNEDVISEEQAKEEEKKSSKTLDEILKRKLEKDAGPGFGRGMRRRKKATPIKYSMDDDEEEDEWLPRKEPEPSKTIYVRQPSILNKGRGRQRKKGKRGRPPMVGFKKSKPKSAPPPPPKQPPCKTKIDIRKRSDEKNLPIPYYVFNRILDDRVESWFKNYKEKRDAIEMIRCPNDRCANVMSVDEMAVHEKCHVPNLDGFRCCECGYISLHWAKMRVHYRSDHNSKLNAATCDFEGCEKVFPCIGTKLLQSHAIKAHFKPQLLARLKSPDFDPSEYDKFMQAPENEADRQSIARKRGNKRTADVEDEEEEEEEEEEEENDTTSKPNPKKPRRKKFKRYKVHVCTICFARFKDEGEMLSHKDAHYKDNSKDIIYCTECTEYNASEEEPMRNHLATVHKRMLHLHRCDECNNFSTNHYHDLKKHLVTHTGAKNYMCELCGRRTTTPFNLRIHYRRIHASEDEKKHHCMSCDYKCADKGILKVRANE,RALTLIHQSMTSSCKPLKMRLTGSQLQENVVISALQMLKMKKKKKRKKKRKRMILLVSQIQKSLVGKSLSATRSMYALSALPDSRMKGRCYLTRMHITRTIPRILSTVQSAQNTMPLKKNPCVIT,MVNVEPAHDVQPHHVKLSADNQIIRHHPMIEYSSVPPVSVAQQIIAAGNDMAQVISEQQYHQIQQQQQQQHQQQQQQQQQQQQQQQQQQQQQHQQQQQQQQQQQQQANHGHPPQSHTPLAPPLVLHSQQPKPPLHPVPLEIHAVPIHPTTHSNIPTPVVEQIVNTDHAVHHLVAAMMPRW,STAVCHRCLSPSRSSQQAMTWPKSYQSNSTIRSNNNSSSNTSSSSNNSSNNNNSNSSNNNSSINSNNSSSSSSNSKPTMGIRHSPTHL,HGPSHIRATVPSDPTTTAAATPAAAATTAATTTTATAATTTAASTATTAAAAAATASQPWASATVPHTSSTSSCTTQPAAQTTLAPGASGDPRCTHPSYHPQ,IQVDASNGMNQQEIIEYTVPNTQQIITSSGDITEVITSDHHYHASHHISDHHQLQQQHHQQQQQQQQHQQQHGGDSSIQQAAMHAGIPTTSAAEQVPTAIVEQIVRATPHSEDQNVVHNLVASMIPHDAELVMSMMQIQHVPQVQHVQQVQQIHHVHQPNVSQ,TNRRSSNTQFQILSRSSPAVATLPRLSLAITTTTPVTTSATTISCSNSIINSSNNNSNINSNMAATAAYSRQLCTQVYPQPALPSKSRLR,MHSGATNKIVSEHIMSKHAKVRPYRCNVCGWTAAYNGNMWKHVENHQKILGDQMPEFPVSVLSNIDDLNMPTPLRAPSGKKRGQGKDPTSPSSPTSRPKAKRSRPKYAEPPSGVILANRSVVQEVPVTVTVTHVEQEQPQAQAPPPPQAHTLPL,LFSNFIGATNKIISEHVMCKHANVRPYHCNICGWSTAYSGNMWKHVDTHQKELGDKMPEFPVNVVSTENHSVPTPLRAPSGKKRGMNKASNFKLKLAKPGKTRRQQQAQKTQQMEQTATISILDDNVQTIQVQAGGNLPEGVLMQVSE,YLVTCHLIHTLYFIIPQEHIKHKHGLLLNKDSYGRPKPTPTYACTVCDYVGRKPKSLEYHSRIHKENRQFKCHLCPYASKTKNNLVLHVRTHEGLQPNKCPHCDFKG,QPTCIDCLISFYPKVPIHMGPNGQMMVTKGLSEEESIGSSALSRLAAAVASAQEVHIIQGNEGLEGGGQHQEHRIIATQVCLAFCPFF
###Gene_Info_Comments GLEAN3_12914 ###
Inspection of the tiling array suggests that glean may have missed the following exons: SKSEIGNHFLTDHMEAYVSPIKEEGKTVSTNDSKEPDGEKEMEEKAGEEIGEDEENLAEEVEEKRPAPRPRGRPPKKRNQKPIKKQPYYYIEEVEVKDEGEETGEGEEEPKEFGRGMRRRKKAIPRHILRYELDDDEEFMEDEDYNENEERNVLPKKRVPTIPPIVMKGKRGRPRSSGSQEKGRQDEGSPKLKTPKIFKKTPTVSKLPLSDKVIERILDDRVNEWYLVFQEKHVLPIPCPFDGCALDVTQAELDVHLQCHAANLEGFRCPIDECNFLCPHWSNMRVHYRKTHEPTFYRLLCDLDGCAMVFPRIDKKSIHIHVTRKHLRPELHDELSSKDFNIEKYEKYISIVKAEEADLMKQVGEQAEDDEGEAGTSLGEHSNVVHVQQMSAVEQLETESDFDTENSRLKKRGRPRGSTKAAKLARIAAGEVFEKKGKRDKKGKGKRSFQRVMNNVSFFCDVCGGKYKSEQAVFDHKTLHYRDENCNVLRCTECTEYSTEESSELREHVALSHKSLLHLHRCDECQFSTNRYHDLKKHILVHSGSKDYMCDKCGTCTTTAYNLRVHWRRYHAPESEKNVKCFACDYMCADNGILKVLFFSFSF,VTRASCICIAVMSVNSQPIATMISRSTFSSIRAARTTCAISAAPVPPRLTISESTGVATTLLNRRRMSSASLVTTCVLIMAF,TVLDGQTITLAEGISEEAAMGASALSRLSQGGEITVREVHFLQGGDNQQQHHEMITYNVPPVAVTQQIIAGGDMGHVISEAHYQQQQQQLQQEHEVEQHHYVEAGLQTVQVVTSHQDNGVPRAVTEQVVHAMPHPQDHHQDHHQDHRGADQNQDQPVVHQLIPMTLPHEAELVMSMMQAHQASISQSQ,SRDGTGGARHAAPSRPPPGSSSGSQRCRSKPGSACRAPAHSDDSATRGRASHEHDAGTPSKYLSVTVTYHQEICGSRGGAGAMRPFI,LFIFILQEHIRSKHGLMVYGKDFDNARPLPTYACSQCDYIGRKKSSLAYHMRIHTENRQFKCHICPYASKTKNNLLLHIRTHEGLQPLKCPECDFRGKNDKGLDNSFER
###Gene_Info_Comments GLEAN3_13406 ###
Inspection of the tiling array suggests that glean may have missed the following exons: FLSFSLSLSLSLHAPPLSLPLYNLFXXXXXXXXXXXXXXXXXESSEVLKEQAASCCNSNISGERNFAQLDSHLHHAPNIGIGKIESKVMFKANATRHWLQNKPQGSRKELIRNNIKAGAKERKREMENKVTYREKVKRRVREKQQKLTEKAEKARNKVEELIESILQDGVIEHRDECETKVNGMSKTRATGMLKAQIQFRTKILGQDIGKHALSKCSVDELKNILLSIPEPTDKNFLTDVKDPGQIINREFAQKWEKDEKEQWYNATVINLKDGEFEVSYQGNNELFYMTIAEFFTDIHL
###Gene_Info_Comments GLEAN3_13407 ###
Inspection of the tiling array suggests that glean may have missed the following exons: YHLITFLILWSFLRSYPSHNTRSFNTRSSNMKPADAGVRRGAILVLILPHPPHQLKHQANPSRLLNPSLQSPSLQKQLQASPFW,TKFGWMGFPQAAPINEESRLLYLVIQLMKNHPAVSSLSPTKTATNIRLRYKTICDRIMDDPLLSTLNLPLPNINSKSITNFISKQETKYNLMSTAQPKVVSHRRVISFDNIPDPVELPEVIPKPQYQELQYKIVQHEAGRRRGEKRRHPGSDPSSSTTPTEASGEPITSPKPIAPKPIAPKAAAGIPILVV
###Gene_Info_Comments GLEAN3_14197 ###
Inspection of the tiling array suggests that glean may have missed the following exons: TPFLLHLEISHYLLSLPSFLLPSLVLTDFFCTLYLLSPFYLFSFLPGSHILFLACFLSLSLSLSLSLSLPLYVYFSLLTHHPSSLSILHELRIMRICSNPEFSLLSA,CSLQVALHVSSLSSRLSFSLPLSLSLSPSLHFSDSLPLRGYYVVCGFANELELPGRCAAMYKCHWPLRTHTHTRCIHLQHTLALSHTHNTHTHSELSAHTAGKGGLIHAYRTKGLCS,CSCSHRALHSKLLLVSCASSSMLKIKPLLMILPTHSPCYYLYCHNTCHLVKICNSWFLFCAHPNICFRSLSSSPARFVVNIQVMNLYFREKSHSLLMSIPHPSDI,TGVGCMLLADYAIHDRAVHQSDRGTVFLHLDHNSRNCTTPLTTNCWKCIISLKEKEREGKNTLPTQHSNKSRKKRGTVITDFDWVCMYDIGSAPPIHGAGSVTHSRHLNTVHAWSSNDNEER
###Gene_Info_Comments GLEAN3_14643 ###
Inspection of the tiling array suggests that glean may have missed the following exons: NIWDNSHIELSCTPDNATYVLLYSHQTLYDPTTSKFRKSESSDIKSSTISEVIILKCTDWDLSIFKLCVDIWDHLLTMNWSVSNFHKYE
###Gene_Info_Comments GLEAN3_14684 ###
Inspection of the tiling array suggests that glean may have missed the following exons: LNFSDAQKFQCSLCDGLFTSAKLILRHIRCEHNSGDEMIPVLTWKKKKKKKYVAIEVSSKQQLATQIKVSDQEESAFKCGTCKKVFPSFGRLMAHELFHEKEQASPVNVKDSLTVTKQMKPQGSKQYVCSECTKEYKSWRSLNRHEREAHGYRCDFCLERFPKKKDCLTHEQTHQAFKSSQPAGKSKASPTKSRAASTGIQPSEPAPDEPKDMLGRPTDYYKRPYKCRFCTKRYSSRGTAERHEKEVHKGEGDFKCSYCTKVFATVSRLKDHLVLHKYVNMYRCTECPRSFASESALNNHQGEHTGLKPFKCEVCGRGFRTRKLALKHKRRIHQERPKRFLCTFCDKGFADKSDWKVHERRHKGIRQYVCLECGKGFTSSTSLAAHKQAMHIKVKPFSCAVCSKSFALNHQYNHHMAKHRLEGEGNALASMQQS,SGPVHSSANHLHHAHLQRPVAFGGHGVKFSSLPASYTPPITAQEPAVERNDVPVSLTTCVPVVERTDLPTTTRESVIDIPPISMRESPTDSCIPPTTKQELAEDSYMSPSTAHESPIDSSISPTTTQESVEDSSIPATTTSEQVTDSINIEPVATSEVAMDDDIPPAKSWGTVEERSIRLVTTHKPQAEGNTLRAQGEDSL,LGQSSIKSAPSLQSTDHRKQCLPSSKPFHQANASPPAPDVSAAEPPLFAALNLTKTISVMDLPLSLSLRVASQDKVEGVVAKDTVEKGVEFGPYTGTLLDEEQGSSKETTWEV,VSHLLNQLLHYKAPTIESSVYLRQSRFIKLMQVLRPQTSLLPNLLFLQRSISQRQSPSWIFHCLSHYELHRRTRSKEWLPRIQLRRGWSLDPTQEHCWMRSRDRLRRQPGRY,ANIIHQTPPQPPVTLPVHIRGHGVKSSTLSANYAPPINTTHGAVEEERNDLQIATHGSVEASKMLPLAFHKSVVERNFQPTTSYESGLHVESNALPIITCKSA,TRCPGVRDRTLVVTQRSVNKLFVARNCHSFTLKKPRQFILWQFTAPYLNKPLLSLSLSLSPSLYATSLNLENELSSASTDSNLTLYH
###Gene_Info_Comments GLEAN3_14685 ###
Inspection of the tiling array suggests that glean may have missed the following exons: SPVISDAQKFQCSHCEGLFSSAKLILRHIRCEHSDGEPCEMMPALAWKRKGKKKGREKSVAIKFKFNHPINHPIVRKRKNSEEEEECDFRCGTCVKSFPSLGRLKEHELFHEMMHGDKPYECSECNQRYTAQSSLNRHEREVHGFLDDYKPRSRPKRLKAHVPKKPLHCRYCGQGYKSRGALANHERRIHGSRHPIREPDLPNDEPKDMLGRPSDYYQRPFKCRFCPKRYVSWTTVEQHEKEVHTREGTFKCSHCPKVCASESRLKEHLVVHKYMHMHRCTLCPRSFASESALNNHQGEHTGLKPFKCEICSRGFRTRKLTLKHKQRMHQERPKRYICSICNKGFAEKCNLKVHERRHKGIRQFVCLECGKGFTARFSLTAHMQAMHIKERPFACEICGKSFALNHHYNHHMAKHRLDGDDSIPQ,RRMYRKSHFTVVTVAKGTNHAVHSRTTRGESMALGTRFGNRTYRTMSPRICLVDPLITTSDPSSADFVQRDTFPGQRLNNTRRRSTREKALSSAVIVPRFAPVRAV,SVKEIQTIKQREQCSSSSHQASASSSSSDTSNPTPNTSKDESQLLAALNLKKTKSIQDLPQNLLFRATPEGKVDGVVAKERIEKGVEFGPYAGTLLDEEQGWTRDTTWEVRRAVFHKTVF,FPLDSAHGVSNAGIIHQARQQLPVHLRGHGVKSSTLSANYAPPITTHEPIRERNDLPITTHESVSSIIQPLTTPESGAKSNVPRPQGTVCNFCLVGFC,MKKHEPKFYRCKKCNQKCKTKTALNKHEREVHGHQCRFCSERFFKKSECMKHEQTHQAFKSLKPAVKKHESLSKTQASSPTLIHQPSEPSPSEPKDMLGKSTNYY,TRCPGVRDRTLVVTQRSVNKLFVARNCHSFTLKKPRQFILWQFTAPYLNKPLLSLSLSLSPSLYATSLNLENELSSASTDSNLTLYH
###Gene_Info_Comments GLEAN3_14793 ###
Inspection of the tiling array suggests that glean may have missed the following exons: RRVDNCGPVCGAALSILIIMRIHDFISSLCGSLSLKLSSPFLMRNYLSALTLQSSFDDMAEAFTIFLFCHIYWYKFPYIVDCKNVPT
###Gene_Info_Comments GLEAN3_15071 ###
Inspection of the tiling array suggests that glean may have missed the following exons: STTGFPSWSIPVPLERSGWTKAMQSCFFFIMRRGDYGPMVLIPMLFLENKKQIIFDSPRLEGETGKVMRSLWPWTNHVVTI,STTGFPSWSALVLMERRGWRSFSVQVIHCHRVGEAMINNWFPFMEHPCTIGEIRMDKSNAIMFFFYHAEGRLWTNGTNTNALLRKQKTNHL,LVFTEPTPPPPTPTPPPKSPTPPPKEPTPPPPKPKPKRKAIVKKTKAVPPPPPPKTPTPQPPTPKPPTPKIPTPQPPTPTPTPPKDPTPPPPSPPPKVTLGKFMCTCFVLFKIQIYI,LFLQNQHHHHQHRLLHQSHLPHPQRSLHLPLPSLNPSVRQSSRRPKQSHHPLHQRPPPLNLPPPSHQPPKYQLLSHPPLPQHPLKTPPLPLLHRHRKSHLVSSCVHVLFFLKFKSIL,LYDDRCVHDEHHKYRQDEFQKDGYRLKHSPEEGVREKCEDTSVVLEMGKEARRELKQESDSPWESCYELHASGVRLTPEADGEENRTKTLKCH
###Gene_Info_Comments GLEAN3_15137 ###
Inspection of the tiling array suggests that glean may have missed the following exons: CLGSPLSNFTMMLKSRMCLGYLQSWKEKKSELAFNQLFFQSSIFNKEHLLIQLSLTLLPLVSILAKWLLRHCRLINFSNP,LNYNFQQNILLTWVKHFTTSPLPPPPLPLSHTLSSAPPLPSPPPQCPPQHSNAGATPPPPPPSPPLPLSHTLSSAPPLPSPPQCPPQHSNAGATPLPPPSP,NILQLHLFLHRLFLYPILSPRLLLFHLLLLSVLLSTPMQVPLLLLLLPRHPFLYPILSPRLLLFPLLLSVLLSTPMQVPLLFLLPR,FSTKHIANMGKTFYNFTSSSTASSSIPYSLLGSSSSISSSSVSSSALQCRCHSSSSSSLATPSSIPYSLLGSSSSLSSSVSSSALQCRCHSSSSSLA
###Gene_Info_Comments GLEAN3_15358 ###
Inspection of the tiling array suggests that glean may have missed the following exons: STASWRTESVSALTLMTSSDLRAWQCGSTVLSMSDVRAVLALLHHSASSSRCSKGVLFKTLPYSPEVKLALSDIMKTRSFRR,PIIFFHFFVRLSPTAYHSNFCEGYCPFPLDSHFNGTNHAAVQAILHTRKMKRRDGRRIPSPCCVPNSFTGLSVLYLNEEKNVVIKDFEQMVATSCGCH,RVLPLPPRLSLQRDQSCCRTSYSTYQENETQRWKTDSKSLLCTQQFHWSLGALFERGEERGHQGLRTNGCYKLRMSLMSVMQPRDRHRLWFHQ,LHGLDNQADDVSDGLLIIVAIASLIVLWSLSLVRFLFLSLSFGSSEFGLVISMVPHLEVLGWYEGCDEESRHDDQAQIQPAQMENFKRKTRSM,AGTGGCIVGAGSTPSEAAGTGLVAEGAACNGVGACGAVARLVLGAVRVLGAPYGFSVTLVVVAGAGGVGLVLGGAVAVLGAWLMMVVVVAAPGVRNWVCC,PSPHPFSSSPLFHVSLSLSTSPLFPFPLLSLTLLLYISLHLSFTLAAPLSKYIHNPLLNRPPSSLPAFMPFISFFPLVSPSRSLHYPISPSSLSSSSSIIFPLYILLFPKEEPC,SKKPNPGFYENILLFVIDPKPVLTFSSPLFFLSSLPCLTVSVYLSFIPLPPVIPYPLALYFSPSFLHFSCTSLKIYTQSSA,KPAFSMIHTHTHTHAPESHTQTHAHILQYSKPKLSMYVNTPLPASIQHAPLRTYSSLHRTPRKQAVLYIEAFDLKINRSRTEM,CDGLVLLVAPPFSSGIVHLKEGINGHGSLLLGLIATDNLYVFTQSDRNDATLLGARILCRGNKVSGNESQRSHMRNENTILFFLFSILKT,VHSKVNQVGLYHYRFLPAYRTHQTGLPFNNRCWLSSNDISRHLTQFNPHTITMHESGKPTSAMDRKARFTQRTTPDSQLASKVCIGFIARVVN,INRSRSTVKLIKLVCIIIGFYQHIEHIKRAYHLITGAGYHQMISPVTLLSSIHTLSPCTKVVSLQARWIGKRDSHNAPHQTHN,EPRSPPRSSTKGLHPRSQQGFEQTQTLFILPCRLNILNITIIDFDVDDGAYDDEHDDSADHDHGGVQWTSDDVDVDEAMSRSSMLQSTLK,GNLTLTGYKDEKSFKRSSSSTTKTSTLGRQRKTSSAPVETIRLMTSLKVHKFEMKTKKKVENALIIVNGSHPKFFRPDNK,RLRRRWKMLSSSSMDRILNSSVQITSELIIKIYCVYSLCLPCPLPLCPSLSFSLSLSLSLSLSLSYNLSYLLLPQFIYMTQSLLYV,WLDGKKKRNTLFLVFSKQYTLSTTTYCRKKKFLAKGVHSVQHMTVFRVHNHPLNMKQKHFFYAINLTIRMREKQTVKQGSISRTNH,LNDFMFQQERDSGISASFTRSTCESERSSSSSMSSEELNDLSPQRFFPTTVSFDLEDDDHNTSPVPSPTLREPSGRQANSIVRRSTFYRNSEGNTQNIY,AYYTTAKTEVTNGIPYLYDDDVLVLGRLLYFVKCGVFSSRCSHEGSQNHDDQLHHSTCYMHCYSNCIPAKICPTARSQWST,SLFHLSSLLSLFSLFSLSPSSHFLLYFLLSVSLFLIYKVYIQTEIIISNNYVCKERRRGTECGLPLQLCGSCLVLVQLHVHVFHLYSN,FSSKGYIFPQNHIHFVPLCIGTKLDVEGSIDIYCGGYISISYFYDIDRIFCLNELHMNIVLAKQMHSLLSLFVFDRLTRY
###Gene_Info_Comments GLEAN3_15640 ###
Inspection of the tiling array suggests that glean may have missed the following exons: LIHQKEYGMKSKSWRANFHLQHQMSSSTVHVSVCPNFLYRLLINRRKCKRRAHHPAKGHDEMLSIGSNKYHRVFLESSSTRKGERF,SLVYRHTWFTLLRVVCSKYFNINLVHSQLFYHSSRTFVLLFLPINSSMIYIFFLKTASPFPLSAQALSAAPFSTTWPLLGPRGSLQSCPQSEMFH,SELVNSLERQPTSIALYLMCYLGGGMSSKGYRLRSVRRLHVSRAQERKKRPVRFLCNLSPYRVAVLMTIFGHLVAWQSWRGHHLFSYHCAPILFISFHLSLSLSLSLS,SHCLSASLSLSLFIFLFYPPQSLVTSPSCCPSLHFFKSLNLSSHLPLFSTFSPSLPISSLSFSLVISLISSLDLSCKIYPGHFSFLPPQLYLIPFIMYLYLYFIITFVF,KPTRPIFPHSRFALFPYGINQCDAATASTSLSFPLALFTPLSIYLYHHSFFLEGERKKNKSRHTEARPGFSPKLGSEILKQ,SARNRKQNLTVTTRHLDYFSYDNQLPLTTPAPPFSISLSLSLSLSLSLSFYFFLPLSLSPSLSSSSPSPLLLCNHPFLLFSP,STPSYHPRSPFLYLSLSLSFSLSLSLFLFLSPPLSLSLSLFFFSLPSTPLQPPFSPLFSIILFLSNFLSLSFHNASAISLPVLSSN
###Gene_Info_Comments GLEAN3_15688 ###
Inspection of the tiling array suggests that glean may have missed the following exons: LAYSNSMLPTCRHHPFSDLNANFSHLSFIDFKFLVQNAIHNQRHIGITCYLINTDNYFVFIPPNQAILAPVCKFHQSSSTG
###Gene_Info_Comments GLEAN3_15772 ###
Inspection of the tiling array suggests that glean may have missed the following exons: PLLILPKTDFEFPVSSPEKNSMSKRQRMHMVIESTCYDQFVVEPIRSGKRGFFTYLVGLIFVRIVDDIFSHLSLPLFYLVLLLQ,IYQVSISVEFVSPTYLHVFICFPLLSLALSIVYTNATSFSMFVCFHFLNLFSPLLFVFSSSLFKTKSFLIFSPLLLPPSILCSVFILISSPSLFNSLLDINLHLSISLSPLTLPPSLPHAHCVTHFPFFTRNEFKEGYFLA,TCLYLVIKVPPTLASMCSSKVPTLASMCSSKVESYFGNRKHIWGKERDVMNLPSIYFCRICLPNLSTCIYMFSSTLFSLVHCLYQCNLIFYVCLFSFFKFVFSFVVCVLLISL
###Gene_Info_Comments GLEAN3_16250 ###
Inspection of the tiling array suggests that glean may have missed the following exons: RSCLPNSSQSKSSTLELSARTFMYLNTDTMPFLFKGLIRPILEYGQAAWSPYRLGEQRILESVQRRATKIIPGLRNLSYQERLTQLQLPTLIHRRIRGDMIDVY
###Gene_Info_Comments GLEAN3_16490 ###
Inspection of the tiling array suggests that glean may have missed the following exons: PTSVRVCGMKKLTQTWKQQKVSKLRDSSFTTASSKTQLVPSQDGGKISQGSTGSHGATSAAPTPSASGGSRKRLGQHIPRQAVPVSVPKPVHDGAEGEKREKENEEDLAKKRKVGGVIGDQSKRRSPRLQKGRDPSR
###Gene_Info_Comments GLEAN3_16518 ###
Inspection of the tiling array suggests that glean may have missed the following exons: EDSELSYLENFSPVSHQIVLKVIRSHKIKSCSLDPLPASVFSRCIDCLLPAITDIINDSLKAGVCPEPLKTALIVPTLKKSSLDPENLKNYRPISNLSFISKVIERVICTQLMTYLASNSLLASRQSAYRENHSVETVLLRVQNDLLLSLDSGNEALLVLLDLTSAFDTVDHQLLSRLEKCYGISGTAAKWFESYLSGRKQQVIIDGITSDPALLRWGVPQRSVIGPLLFICFTTLIQDIIHSHGFTSMMYADDKQLYITVKPSVINHITQKLDICLQEIRLWMQHNFLFLTIVKWQFFTCLLNLENRMSCYQYLLIIRQFIVPRQFAISELFWTTICLCAATLILCAAKHHLLLEE,DLLIVREDDESVEEVKVIHSMSSDHAAIAFTLCESDIVQSFASQTSLDEMDINQKVDMYNKTLFSFLDRHAPEMRQNVQLRPHAPWYNITLKELKQNLRAKERIWRKSKHRSPVMKDELQNSTKEYFSTLKRFRREHHRQTISSASTQELYKEIDNMTIEKAKAVLPTHKSKSDMVIFFILHSSKTKYAD,CNQPHNTETGYMSSGNPFVDATQLSLLNDSKMAILHLSSKFRKSNELLPISVNNTPVHCSKTVRNLGVILDNHLSLRSHINTVCRKASFALRRIGKVRRFFNKASTEILIHSFVSSLLDNCNSLLIGICDKDVNKLQRIQNSAARLVSLQKKCQHITPILKDLHWLPVKFRIQFKIVLLTFKSLNDLSPEYQSDLVLQYVPSKS,KQTEISDFIITENLDVLAITEAWLTGDSRDSTTIADIQNTLQDFKFLKLPRKGKRGGGICVILRNLFDCKARPYSFVTFECLEVTIRSTHKDTVSLFAIYRPPVCPRSVPVSHFFTEFS,TALCSKEFSLGQWCYSSYSLLLLLCCYVEAMVMVFTLTPVHVLTTVSKSESSSLLPHRKIYGSPIPYYSNSTSTFQLQLLDCGDVNPNPGPHTERSTTHSSTPIYHGVNSYHTPNRKYDIPFLKSLNLLSRNDPQV
###Gene_Info_Comments GLEAN3_17348 ###
Inspection of the tiling array suggests that glean may have missed the following exons: CNRNHSLDGSFPLALSLALSPSSPSPAPPPLLPLPSTSLSLFISLIPPPYLSFPSPCLFLSSSLSPVLPPSPLSLSLSLSPTYLSRTS
###Gene_Info_Comments GLEAN3_17427 ###
Inspection of the tiling array suggests that glean may have missed the following exons: PPSSLPSSILSSILSPSLSLSLFLFLPLSLSPTPFVSPSPLLSHSFLYSSSFTLPHLHMILSCPSESHCFHFLFFSFLSFSSITPSLPSSSILSFSLSLSHPLCVSLSLIQSLFSLLQHIYPPPPPSTLIC,LCEDRIWTCMILSCPSENHFVFIFYFLFFILLFYNPLPPFPPPFSPPFSLPLSPSLSFSFSLFLSPPPPLCLHLPYSVTLFSTRAHLPSPIYI,FFLAHLKTTSFSFSIFSFLSFSSITPFLPSLLHSLLHSLSLSLPLSLSLSPSFSLPHPLCVSISLTQSLFSLLELIYPPPSTYDSFLPI,NRKKISYMATMGAPNNSYRYTCSWACPRYLAMWCPNKILWIYEHNFTSNLSFYIVSCLTLDPVELLINVKAKRFGTEHLITNSKMILQISRSKK,DCTESSLLTVPLSRVDSFFVVLEALTFVLLLDNSDCTESSLLDVPLSRVDSFFAVLEALTFALLLDNLDCAVLSLLLEILIDFVFFVVLSFKGSQSSTFLTIGNLLDFVLLLL
###Gene_Info_Comments GLEAN3_17656 ###
Inspection of the tiling array suggests that glean may have missed the following exons: IFGSFHHHHCFPLHHYMSSSFLRHFWISESYRHHHHYHHHLFLLLLLHYYQWLISFSFSSVAGYFEPANLLHYSHAHYHYPYLPS,LSIFDPLLFPGVLLHFWITSSSSPSSSSSSSSSLSSSSLLLSIFDLLLLLGVFVDFWIIPSSPLLSSSSLHVFFFPEAFLDF
###Gene_Info_Comments GLEAN3_17750 ###
Inspection of the tiling array suggests that glean may have missed the following exons: YLERSKFTLSLLSLILLLFIARCFLGRCRLRGLVQTGLSSRQMTEAMNHVQDWYNDPRSIHANEVEPEVERVTILAMGKSITELGHEENDSTCNEHIL,DLFSYSCERSSYQKVASLKTASSIYIIYRNTNGPLSSNSYRADICTLRKDNCLKLSFTGITSKLQNADCTRPLTKNIESTIHKTCTWKIFTVR
###Gene_Info_Comments GLEAN3_17811 ###
Inspection of the tiling array suggests that glean may have missed the following exons: GFRNQVAKILSISQQCSCAPARGQQMMCLLHTILETPGIVHITSCTCAVSERSLFIYLFLKYIPLFIIAAKVPSRCIWLAKCRDMPGEHEALFRLRNPLL,IQMEFLRCYCVMYVHENGRIFLIIDQNSLCYFILFVHSNTHLGLPQSGCQNFKHFAAMFVRTCAWPADDVFTTHNTRNSRHCAYNQLHLRSFRTFSIHLFISKIHSAVHYCC,TVYVLRSRREHMSCFAYGCGWFVFTVGKKIFIAYVSLVILCVVYFLFSLMFTHVVVYCERGNDTKYEGHILGRYVQFVTRWRQDDRVVSQ
###Gene_Info_Comments GLEAN3_17846 ###
Inspection of the tiling array suggests that glean may have missed the following exons: ESPVNPIELWKKGPIVARSVVKTDEDVLPVRVFNPTREQRNIKAGTAIASLSAVEEVGTEMCNQTMPTETSASRNMPPDKQGNKDVDARRTTTHFFQCDSCTKSYEKKDSLQRHKREVHILKYRCDQCDYRTGRKTEMERHQGAHEVAVVPVRKHSLTPVSSPRCKPTSTPITGTSEKADASWRHVRMDWRSPPRDDPRLDRPRGRSPRRYRSSRRKSKSPQRDLTKSSCQKTPESPQHDGSEAS,RCVNCLSGCSIQQGNMGSQRQALPLPHCQRWKKWAQKCAIRRCPLKPVQAETCHPTSRETRTLMPEVRLPTSFNVTLVPKVMRRKTL,DFLVQINAVLDCQKMELRTEWGIIPCLDSEGESFCRRIVAGEEYSIPPGHEMVLANRVTGEKIALCEGLVESPVNPTELWKKGPIVARSVVKADEDVLIACQGVQFNKGTWDHKGRHCHCLIVSAGRSGHRNVQSDDAH,LPVRVFNSTREHGITKAGTAIASLSALEEVGTEMCNQTMPIETSASRNMPPDKQGDKDVDARGTTTHFFQCDSCTKSYEKKDSL,WCQKWDPVTMAQGRNRQNRPTPGEMMKHLERSMTEVDQMMEQLRASMSQVPVADPEPGRQEAFPPTLSGQTASGPTADSQMPAAQRVRFSSTPREQSRRLPSAGTGVHITPPLFSPP
###Gene_Info_Comments GLEAN3_18373 ###
Inspection of the tiling array suggests that glean may have missed the following exons: LKMKDKVSRMETLLVELHKVVVERSKQTTVRGQPYEGNESLILIGDSEHAVHVDKRRFQRAVHLATSVHLTKQTTERGQPYKAVEGNESLILIGNSEYAVHVDKRRFQRAVHLANSGRALLLKLMSMVIQPDELGNFSYRGDRNMEPPLDSLIDDDRFKAIQLQVRKSFPCFDAPKNLRRIRDAVNGKCRKLRRISSPS
###Gene_Info_Comments GLEAN3_18619 ###
Inspection of the tiling array suggests that glean may have missed the following exons: MTKYFLFISLFMACFMVTTCHDLRVCICTPIDLLSPSCPHICNSQLIVKIICTNNCQSLVNQSPYHFVLPEMLQLQAAEPH
###Gene_Info_Comments GLEAN3_18812 ###
Inspection of the tiling array suggests that glean may have missed the following exons: RVLSKKYHYNPLRIHRGETLLKVCTLYIVIRESLPKDNITCHLRIRYIQAREKQFGFGAGQEFQGAHCFLPPPPSTPGTSAYI
###Gene_Info_Comments GLEAN3_18850 ###
Inspection of the tiling array suggests that glean may have missed the following exons: VRDIYINTQTCTYQVASDHINNMSVSKNIYSSSTDVCTFAVQNFLYIKQKQLESGLRKFYIVSVCYLVLLLNLISCKLYIDSL
###Gene_Info_Comments GLEAN3_19435 ###
Inspection of the tiling array suggests that glean may have missed the following exons: YCEEVYMLERNEYALCRLCELCKPLQSFIVYGSSWRSCYFGNPHGSCPQNICQIFLDFRRCCILVSCKMIRYVYCRHEEYQSL
###Gene_Info_Comments GLEAN3_19651 ###
Inspection of the tiling array suggests that glean may have missed the following exons: QTFVNTEDIGRYPSLRGPPAHDTSSGCTDGTSCRNGDTCKVADDQSKRSAFQAYQREMPKLLLLAYLKLSPFRLHPLPYQRL,RSFVRESFVAEGAEVAQLSSLMHPAFMSTYPFADTESLPTDLTLERFEACMFPLVIGEGAIRRKGSWAERAAVETVLLVTLVMLKEVAFFHEDTRAQVTLE,EHQPMVFACIVNAAILVLLKNCYKSQLIMAQLIITPAHSSFILHDEPLFISDISCHQLYIFTIPCLNVHLSLYCSPWHVSHNCDCLISVDSISL
###Gene_Info_Comments GLEAN3_19652 ###
Inspection of the tiling array suggests that glean may have missed the following exons: GFSFSWTARTCNCSECLAVNPFPHIPHSCGRIPVCRSACFFMSHCLANLLLQNEHGNLSSTTLCTLPMCAFTSSFLPKPFPHTSHLNGFRPVCSPWWFFRAFSDAKERGHSVHL,SGFLPSISLLEVSPLSSCTTSMFCTTLLGLLTLFNFIRLWVVLFSGLTSSWIVSSLCGALSWFLSCSFSCSLIPCFHLNCKGHMLQV,VLLSIFLFNSLEDFDDSLPSRVCLWVLWLVEGGMWLVEGGLWLVEDGLWLVEDGLWLVEDGIWLVEDGIWLVEDGIWLVEDGIWLVEGGISLEFLLSCSVR,FFPPEALPAYVALKWFQACVFPMVVLQGILGCEGTWTFCTLVEFICNVMFLLVNFITTAIHKLLWTIQALISPPAFVDLSLMGLHTANRGVSLRAIFTLESKRWILEIIRLLAKHILVGSIPIIFMYNIHVLYHSSWFVDFVQLYTFVGRPVLRVDEFLDREFSLWRPLLVPLMLLLVFSHPMFPSELQRAHATSVVLVRLPVNFLFMAIQGFG,EVCLRTLLALVDHSLMASHVSHKLMLTGLCFSADWAIVFSCECVKHHVCLKPALCGERLTTSHANKIFPRLGGRLHIRFHFMSSTFNILVQFPGRL,TRRKPKLKRNKPSSESIFFNTGQNSKGSCSHDPTQFDANACRAALNGACAQLAINQNKQTSVDFMGFPIQVVEMVRFFYNVNGPKDQLACSKTHLLLLL
###Gene_Info_Comments GLEAN3_19653 ###
Inspection of the tiling array suggests that glean may have missed the following exons: YFNLNKSHTSISSSLCLFTGFTLLLLLWAAVQIFCPYTVRSGAIASGLMFCPFDVLSLTIASVLVFCVPMVLFVTVAIPLFLTLSLFDP,EAGGNFVRPGSLDSVSGLLLDSMILPLALPSSFPRCFVISSMIPEDKMRTVSPLLLSGLVCPKPFDAPELRILEESCGRPSLVLAVL,PKMTVTSTPHPLPPVHLPAVHTVYQQGSPPQRSLQRSSGLHHDEMRKTCTRIPPVMVEHPSLLLLHPPSPSRHHPTPDSYHPHQRPHRSPLALPPPHNHYYNLHLSLTHLYLTLDCHHHHHHRRHCHPPPPLPSPPCLFFPSPLPPPHHTTLHPPPIHLHPPAVMRHNGTSPQHLSGNNNNKI
###Gene_Info_Comments GLEAN3_19824 ###
Inspection of the tiling array suggests that glean may have missed the following exons: NCQVSFSTEASPAGLARKRPDPGVGAEVISQIRACAELLVTYPALVRLLACMHAYMLDEVTLRCVSLGTYSAAVGSLPGMTTHVEGEASLCTAGFVAQLAGEWTFTCMNSHVNCQVSFTTEAPPAGLARKRLDPGVGAEVISQVGRLAERFLADGACEWLLTRMATHVAFKVASFDKALVADTAFEWSL,CAMVLWPLARCSFQRICLFVLFGYKIYLTVNSNSRFTNHSHTLSRGFYIQTLPAIYLNHVIGWRSWICIRYMLALKLSCTQTIFHQVISIRSL
###Gene_Info_Comments GLEAN3_19825 ###
Inspection of the tiling array suggests that glean may have missed the following exons: AEFFQTYITFVRFLVGVGSHVPGQIRACAELLVAYAALVRLLACMHADVLDEVALRFKRLWTCSAAIRSLPSMATHVKGQVALRTAGLVAQLAGEWTFTCMNSHVNCHVSFTTKAPPAGLARKGPDPGVGA
###Gene_Info_Comments GLEAN3_20311 ###
Inspection of the tiling array suggests that glean may have missed the following exons: STNGIHNIAYAQCRKQLRTPKCDEVARSMPKRCGRTEGSCSVAQVSLEPRERSNRPRFQTNKQHHVVRIILKTEHGPIISIFYRFSKVFRSEFVECI,LCKLFFVKCLFEVQMVFTISLMRNAESSFGRRSVTKLLEACRKDAVEPRVRVASLKCRLSLANVPTDQDFRPINSTMLSE
###Gene_Info_Comments GLEAN3_20481 ###
Inspection of the tiling array suggests that glean may have missed the following exons: MARVFAFACPSAHTGQACSWLFSWNNSCAFSLPQEGKIFPQVPHLKTNSFLSSLSFMSLSSFSFLTCCLMVNSVSAGFFFFFLYVRTGIISSGIRLVCSHLMCLRISLADVNRPSHWLHWNF,RLESLVCLLMLHTFTLLKKPFRTEPALMPMDFPLMFIEGSLCFALLVAFLTPVELGFMFFHVNGKGLCIRLSFSTHRTSMFLAVFMEQLVCLQPTAGRKDLPTGSTLENQLFLVVTVIHVTV,KHEREVHGHQCRFCSERFFKKSECMKHEQTHQAFKSLKPAAKKHESLSKTQASPPTIHQPSEPSPSEPRDMLGKSTNYYLQQRPFKCRFCPKRYVLRKKVNEHEKECHTGEAAFKCTHCPKIFTSKAAMMIHMKCHEQHRMYRCTLCPRSFASESALNNHQGEHTGLKPFKCEVCGKGFRVKKAVYAHRRRMHQERPKRFFCSVCDKGFADKANLVKHERRHKGIRPY,RKKSQCKEQSAPATNSSSSDTAQLLVALNLKKNAIISSSQELPKGLIFKVASEDKVEGVVAKDTMEKGVEFGPYTGTLLDEEQGWSKDTSWEVGRNSGNNNIDRMFL,VRNHNVKNKVLQPQTPHPRIRPSFSWLSISKRMQSYPRVKNFQRASSSRWLRRTRWREWLPRIQWRKGWSLDHTQEHYLMRSRDGPRTHHGR,ISGTQKFQCSQCEGLFTSAKLILRHIRCEHTSRIPDEMIPVLTYKKKKKKPAETELTIKQHVRKEKLDSDMNDSDDTKELVFKCGTCGKIFPSCGRLKAHELFHENSQEHACP
###Gene_Info_Comments GLEAN3_21205 ###
Inspection of the tiling array suggests that glean may have missed the following exons: PRPFYYLLHILSPPVCFFLPLFISSPPPPPLSVSLSAYISPPSPFPSASRSFSLSPLLPSLSAYQSLLSPLPSASSSLSEAPLLLLLLSLSLSPLSPPIYLHPIPSRLLLTPSLSLLLLLLLSLSLSPPISIRFFLSRPHSL,FDHALSTTCSTSSPLPSASSSLSLSPLLLLLLSLSLSPPIYLHPLPFRLLLAPSLYLLSFPLSPRINLYSLLSRLLLPPSLKLLSSSSSSLSLSPRSLRLYISTRSPPVCFSLPLFLSSSSSFSLSLSLRLSLSASSSLVPIPYKTQFT,LFLPRTIDLTTPFLLLAPHPLPSRLLLPPSLYLLSSSSSSLCLSLRLYISTLSLSVCFSLLLFISSPSLSLRVSISTLSSPVCFFLPL
###Gene_Info_Comments GLEAN3_21382 ###
Inspection of the tiling array suggests that glean may have missed the following exons: FGEEDPFTERLEHFMTMPQDQGQFRYTDMMDTNLPHTMDGEEYLQQTMDTSSNSFTCDLCQKPCGNRTNYFVHLRIKHAEELPYECGICEERFSTEIILKDHIHSSHTGDEGDFFKCDVCAARFSEKQYLHAHMLSHNEYATFPCHICRKTFMKRKDLQKHLSSHAKTIRESIPCPMCNKVFRLPRNLAIHLRTHSATFICQLCSEKPQSQEVPTSDDHKDGLGRTSTPEENCTEQEEIPGDEQQLFCKVCSITFTDQEDRIGHKCKVNRCSNCLKDFSCPSLLAADGESCSSEENPMCDTCNKAFSLMDKVKPDEHASKVEKGTVQCRMCLKRFPKPILNYSKPISNHPKRNQPCAAKEASNSPLRFPCRICGNIYFMKSTLRKHVKVHEREHIYLCEICDTIFHKKKTYKRHLKVHDEKRLKCPKCKVT,GPTVLPLYVSFVLRNLSRRKFLHLMTTKMALVGLLLLKKTVQSRKRYQGMSNNSFVKYVASLLQIKKTASDINARLTAAVTA
###Gene_Info_Comments GLEAN3_21383 ###
Inspection of the tiling array suggests that glean may have missed the following exons: IEKSPSVQEIKSDLDTKGGENQPQEALHQKIHLPGESQTSSENLVAVGSSPEQLPRHHCIYCKEEFQSMQTLLHVDHACVRINSTTKRCPICPKQVGSRKKFRQHIVSHNLTCKTRHKNQAANRMMDISQVKSPGQTCRCEWCHKEFDDRNKLIDHTRIHFAYGRNQCPVCERWYTNTTYFRQHVRVHGIILSDKDRQYNIQQAGPSKEQHRRCLWCDKKFQSSDKLAAHTYLHVARGRNCCPVCGKWVGRRKRSVFRAHLLTHGIKTFCDDKGQSCKRSVKGSHATQGLKQEVKSVVDGLTQSEVQGENSSRVPTHRCAWCHLGFHDLETLVNHTHDHIHSGMNRCPVCCLSFSGVWGFKLHLSVHGIPTPYRK,QDDVLSLGRSPLSSHTQTFADSLTSGTPPCVSNSDQGPPHPDENVQAILDTVSMAASQVQTSCRLPEHCCQWCGREFDGLEELVAHTRIHIKDGKNQCPVCKKWLSNKSNFKQHLKRHGIIPPFENKSVSIEHIRSNSMFISQMGKQRKSSTSIKKSYLSRKSSKYKTSLSIGSSDLKSLRSAHKRMLIDHPSSASSHSQRDEMSLDLKSLRSAHKRMLIDHPSSASSHSQRDEMSLDLKSLRSAHKRQLIDHPSSASSHSQRDEMSLDLKSPRSAHKRQLIDH,VIDGREKGLLEENVAGDLESPITRVVRVQITKENYQDLIEDETFDFQNEMDPCQELPHAKEPCILSDPQSGPFEYGTCRLCKKACGNRRMYLMHLRTKHSEELPYECQVCKARYLDEEDLESYNRKRVGQIENGSKEENTCELCGDKTSYACHTCNMEFDNRPKLIVHQKVHCKGKHCYCHLC,SGELPVSQIDSKDEDVHRAKGQNDDISCKRPHLAKDTWTMREQTKAETPHGDGRCDEVQQSSDERGRSNSSKSHVPHVLPVGRQRQNQESQQSLEKKQNVQITGNEMSGSQVGQKGMKKQTAGLGKNDDSVLTLY,SLDANSDASASQGQRKEVEQQTNQCHIDHISHSQRSLHLTSDTQALTESPEPANQPIHRCLWCVQEFSSLNELNVHLQIHIVKGKNRCPVCKKWLCNEYYFKTHLQLHGFKIEKQRQIKSLKKKSETNQITKSQIIFRRAPVRGQRH
###Gene_Info_Comments GLEAN3_21843 ###
Inspection of the tiling array suggests that glean may have missed the following exons: PFFSLHHFFMNFSALLLLFTSACSLGFCPLYTSSLIHLLSSCSCFFLNFNLPVDSPRLTNYFGACRKTKQITTDVHKKRSKFLSAKEIPSNFFFYLILLTGQER,TFLPCFCFSRVLAPSGSVLSIRLPLFTCYLLALVFSLILIFLLTLQDSPTTLGPAEKQNKLQPMSIKNVQSSFLPRKYPPIFFFI,LKKKKYSWLQHYYPKVTNEQAIVACSYSQKIGARNHMYSMIFSKACAAMRMYQAVESVDVLCGEHISKLFQKEVLCWPVQSYAHVDTIHVMQEAVCRLRTC
###Gene_Info_Comments GLEAN3_22473 ###
Inspection of the tiling array suggests that glean may have missed the following exons: PVLEGISPVSQDSSQVLPSMENSSQSERELDESINDPTNSPPTHDEDSTGKQGFDCTKCKKRFSVESDLGSHVMMCCGNLSTQCPVCKKIFASKSYIGKHMRLHTGEKPFQCGECGMRFTRKHHLVHHQRTHTGEKPFKCTE
###Gene_Info_Comments GLEAN3_22474 ###
Inspection of the tiling array suggests that glean may have missed the following exons: SEDGCLEVDNEGASEGEHLKKEDHEDSESGWLPVLEGISAVSQDSSQVQPRLENSSQSERELDESMNDPTNSPPTCDEDSTGKQGFDCTKCKKRFSVESDLGSHVMTCQSARRFSRQSRI
###Gene_Info_Comments GLEAN3_22892 ###
Inspection of the tiling array suggests that glean may have missed the following exons: KASFSIASLPSLFSILLRLPSPSLVLPLSFPWSLSFLYLVLSLSLTLSPSSSHCFPLPLPLSLCLCLSFSTLSPFYLHHLRLAHLYL,RHLSQLHLSPLSFLSCSVFLHLVLFCPSPSLGLSHFYISFSPSLLHFHPLPLTVSLSLFLSRSVSVSLSLLSPLFICIISDWLIYIYNNLIQGNSRQFLSFGHSLPCTLTIVSRFHIIGVGY
###Gene_Info_Comments GLEAN3_22893 ###
Inspection of the tiling array suggests that glean may have missed the following exons: NPSMETGKNFKCKKCSRLFTTSGGLRKHLLRCHEKHHVMKSRLRSRQTQTPEACMTRAGRDSSDQPIDTIKTSAGNLTPGEESEEFCFKTKCLNQRKVCNICLKSFAMFKF
###Gene_Info_Comments GLEAN3_22955 ###
Inspection of the tiling array suggests that glean may have missed the following exons: ISICPIFEHSMNLNSRIINSFAVSLPSLLYSHHPRLLTLHPFLLSHQRHLLLAASHPRPRLCCCSATCVTGWYSYVCTTRCGGSGGLQRDAVAYGWPVLGSIRLFDVAFQSIPACTDLAPLTTTLI,IRGLLIRLPFLSRLFCIRIILASSHFILSSSPISAICFLPHPILVLGFVVVAPPASPDGTVTSAPPDAVAAVGFSGTQLPMGGLYSEVSGSSM
###Gene_Info_Comments GLEAN3_23727 ###
Inspection of the tiling array suggests that glean may have missed the following exons: LALYALYQLHERPPMQVPPPPQQDRFLNICLGGAVRHKRSFSKFASRITFWGLKFFWSREWAINLLYRNCPHPPLIFIKTARS,QVAICGWYRDGDVNAAIQKIVQIVQNTICPLIALHIYQPPFYFAPPPPPFSLSFSLCPSVKLEPYPAPFSPSLFPLKTSAWSNKE
###Gene_Info_Comments GLEAN3_24745 ###
Inspection of the tiling array suggests that glean may have missed the following exons: HMCVQSEERPYQCSLCNKTFSTSRQCLKHIRAHADGKPYQCPHCPRRYAEESTLVGHVSRAHSAGKTYQCPLCNKSFSRMSNLKLHARSHTGEKPYKCSVCGKAFSRMSNLTRHTRFHTGAKPFECSYCNGRFTEKRNLIQHMRIHTGEKPFECSICNVTFSRNGSLTR,FFSSDGNCSLDSTKQKTRLLHQNLGERELKDEQRQLLDGKVKCVVIRKRGSDSLQTVEIHVEAPVDKSKREDLLNGSSCSCSACNRETRFVETAERNTCTSDPASFEDEGGWITIEKEKEFDDE,LNADHNNHRGSCSEEKLIQCLDHDKKPYKCSFCGKAFSIMSNLTQHSRFHTGAKPFECSYCDGRFTEKKNLIKHMKIHTGENKPYECSICNKTFSRYGSLTSHKRT
###Gene_Info_Comments GLEAN3_24748 ###
Inspection of the tiling array suggests that glean may have missed the following exons: RIDICGKQSTGSLSVRSSFDNPVSASSTARLLSRGSPVTLSSVGIASAVPPSASSFVESSSGESPAALSSSLGLVVEASVCNTGPTSFSPFCSSPKLLAVSGTFSEMDITCLSNPSHGIDCQGVHSSILRQSSSACI,RSWAASWLTGPPPDISLAASSTAKPVPGGSLVNASPARGSTSVGSNQQALCLSGHHLTTLYLRPRLLGCCPVGLQLLCLLLG,GSYTVISTMKNKPPICNSMAAFPRESKCEPSAWCMVFTTLCKTGCLPSPILLKTLSAMNDCVDPVSIRNVTGCPSTLATMKRPCRF,FSHQRKRRRSLSSLGSRTCENMPLISAIKATLSCRNLVKTPNSVFVRSGPCNSSRLRQKTPNSAEQSKTTLSFFFCGCTTPWCGRYQTQPSSLLVSCSGTFSI
###Gene_Info_Comments GLEAN3_24877 ###
Inspection of the tiling array suggests that glean may have missed the following exons: DNHGSPTVFPHRTFILSEALCGFDFFIFVSPVSFISNLPNLRTSSEKVISVNPKARSSLANLDIIIVLFLQAGSSFFSLILTHSHLLELVL
###Gene_Info_Comments GLEAN3_24901 ###
Inspection of the tiling array suggests that glean may have missed the following exons: FFLSLFRFLHMKKSMCVDQSTQTMPDIYSCSHCGSSLLAKPMPPLARPQPSGPQQLPGLESSQARCINKKDNGDDYTYDDGIVTLDNEKELLGPDKWVSEAPDHPVAIFVKEEVMSDLEQSNSMHSDKEESFFDPRQDQEFRSDSKLEHSETGWYEEEEEEEEEEEEDEDEMEDDEEIDYEALERLDPTYDPFIGSRPHQLVSIVNF
###Gene_Info_Comments GLEAN3_24953 ###
Inspection of the tiling array suggests that glean may have missed the following exons: MDLQNHRVPVQCSDSSEDILPCSDYIKFVQNSFNKTYKDSLYSERGVQTRCQHNIKKYHYRNSYEEVAMPVLSLFIRRQWNSGVWVHLDLDEIYLNLLDAPCDLRVSRE,WIYRIIGCLFNVQTLQKIFCPVQTILNLFKTHLTKRIRIHCIVKGAFRLDANIISRNTITEIVMKRSLCQCFLCLLDVNGTPGCGFI,ISIIITVIIIFTFINLILTNIYFTTINFTIFTSIIITVIIIFTFISIIVTFIIIFITIMEIILLIIVITNINVISITRSPHFSSFFWSSSSSSSLSYII,SASSLLSSSSSPSSTSSSRTFTLLPSISPSSPASSLPSSSSSPSSASSLPSSSSSLPSWKSSSSSLSSPTSTSSALPVPLIFLPSSGLPLHHHHCHISYR,QHHHYCHHHLHLHQPHPHEHLLYYHQFHHLHQHHHYRHHHLHLHQHHRYLHHHLHYHHGNHPPHHCHHQHQRHQHYPFPSFFFLLLVFLFIIIIVIYHIE
###Gene_Info_Comments GLEAN3_25848 ###
Inspection of the tiling array suggests that glean may have missed the following exons: NEVFFILFFNLLCSIPLTLGFIVPSFTFENLYPPLIQTDTKSLGKLRGIAFCLLFHFLHISIISFLAPCDLLLCCFPSSFSEAWTFQGFCLRWKTFFIFISFCLPFCTFQFLKFLHYFRMSHVVIMNFA,LLSMCLFRSCFVSNCESQSRHTMAFSSFSIDVLMVVKSFLSLPTSNLLSSFVSFCAVSWIAFCSIISLGSHVPSCDSWTFKSAVCLFLSK,TLRKWATIKALLRNDFLHWSQLYTGTCRMKIFDAVQSCNLCRSKSHVLFKITWQSGHGCPSLGWRVELGQSLSTTFAVPDVWVSLSSPSSGSVIHAA,LHALAKDEHIIGVFMDLSKAFDTLDLTLDHDILLHKLYHYGVRGVSLNWSCSFLSCRSQYTVFDNAKSTMSSFTCGNVCHRALF
###Gene_Info_Comments GLEAN3_25850 ###
Inspection of the tiling array suggests that glean may have missed the following exons: IYSCLFFSFLFSVSIHFGDEQHIKTEHGVAQPVQTEDTVPIREKLVTPATVSQHQDSKEVQIDYDDDEDNSYAAWMTEPDDGDDNDTQTSGTAKVVESDCPSSTRQPREGHPCPDCHVILNSTWDLDLHRLHDCTASKIFIRHVPVYNCDQCRKSFRRRALMVAHLRKVHDNNMTHSEIVEKLEELKSTERQTKGNKDEKCLPSETKALKRPGLREGRGKTTKKKVAGSKKRDDGDMQEMKQQTKSNAPQLSKRLCIRLNKGWIKILESERWNDEPQGQGNTAEQVEEKDEEDLILTRQVELVIEDGERETSTMMPISESKISEAVLKEAAEGVEENQDASTISHLERKRQTADLKVHESQEGTCEPREMIEQKAIHETAQKETKEESKLDVGKDKNDFTTIKTSIENEENAIVCLLCDSQFETKHDRNKHMLNSHTEHRQLYKCSTCGKTFVQK,HIKETKYTCELCGKLFYTTGAIKLHVDSHNKERAFKCEECGKGFLRAYLLKVHNETVHSNASHCLCEVCGSAFKSQSNLKQHNLTAHTDVYKYSCDVCGKKFKRTTHRNAHMKVHSNDPANKPFKCKLCSKVFAAQARLKVHMDWHYNIRSHTCDVCGKSFLTKGNLDKHQYVHKDKKPHECQICFHGFVD,RSRPGAFLVLGGFCLGAGIGLLGAGTGLCCTVLSASLCNNPPSDMSCICSVSVSAWTPTPSSTRSSVCDIPLSSSDWSTC
###Gene_Info_Comments GLEAN3_26209 ###
Inspection of the tiling array suggests that glean may have missed the following exons: SRDGARHLPEDFPLQDYLGTQSDSSERESCTSENDSYLIPSQVAYFHEKRAADRETPDATTSSHEPGRMDIDRNADITMEYNEMNNQSSQLHATGSDQNDPSCSLTEEKRFLCHVCSKGFYFKCRLSRHMEIHGIEKAPSKKSHQCMVCDLRFSRV,EGSHFKSKIHISSHSLCSGGDSSLQSSYQSMRNSQEMSGDRSCDGARHLPEDFPLQDYLGTQSDSSERESCTSENDSYLIPSQVACFHEKQA,SVCGKSFREKSTHTKHMTTHSGEKPHVCLICNKAFSNTSGLSRHNKIHTGERPYECSFCKKTFSQTHHLSRHIKIHTGERPFECSVCSKTFSERGYLTEHQRVHTGEKPYFCSICEKRFTSNSCCKRHMRIHTGEKPFPCS
###Gene_Info_Comments GLEAN3_26418 ###
Inspection of the tiling array suggests that glean may have missed the following exons: TRNELKNPRHLPRPRPRTFMYKSLVDIDRYRDDVTPNRTRRASLYTIEPSMVFKEIKDSQQIGYTRVNRVSENVLQRGSIVNEQVSEWLMWITTRNAFDIIMPHL
###Gene_Info_Comments GLEAN3_27147 ###
Inspection of the tiling array suggests that glean may have missed the following exons: KPEWRPIPFYVELPPFWLLADEISSPIPHIFPLSPFRATPPFLPQVFLVPNPYTPPCPLAYTTFYLLMYTTLAHSSCHTPIPSRLPPLYHNSHLLLLTTPFFTLSFIIFVLGGVLYLSNVFQEGKKKRGKDN,STYYYTCTCTIVIVAAFFCTLLVCCNDEKPYLTLKSAPHHASLWSRLNLYSLEASSAAMRLARMLVPLCPGLWCQVKRFCGRHPGHYNKVISKDRPKEKQKVVSIDRWSLY
###Gene_Info_Comments GLEAN3_27709 ###
Inspection of the tiling array suggests that glean may have missed the following exons: SSTVFHHQHLSSPLHAPPSCARSKPSANCRPAHIHYIQRVLSCDPVCDSLGTTCWGKRSRNSGKGSNPIGHRPPLLLHLLRGPDSGGTSHA,LVFVHELTCAKPTLADIALVRFLTCVFLLVVIKCAFRGKGSGAKGALERSVLGMLLDMNQQLAFHVKAFLTEAAGEFAAAEVDLVLVALENVAIDEPSRTKLALERAGTIVIVHGFPEHVDLFDLVLVLCWVGSILFNLAYLRCLLYLGTSSCQWLNTMMLPQMIMQEAVSDECLTADIASSRP,CHIGSCWKNSNLSFGSSLLQVTGMWFNAMYFLKMCIKGCWVCETLYTKITFKITRPLSFGTTELVLSFKHFPMFLLLVSRQALLIDVSLVTICTRPRGKRRVEGAWLVGGGSTMFHTFMSHYKLFLCVHQPTHITSIWSFIVILDMCFERVFVRAIKPTLGALVFVTFCLTGLSFSIGGTVPFLKMKQKSVLLFIKTSTLRAPKGIQLVSFHMALKLLRTCTGVATSLAKILTRSVPPYPQLNLQPFFTTSTFRVHCMLRLLVPGQSLPRIADLPTFITSKGCFLVTQYVIR,PLKGPKRTPAGSVRSATTQTVFCPSFLYASPSGRGPPRGASSGSLVVSAASTGASACSDGARLGAELVCGGPDCSPAGTMCG,YAIHNNIMCRAPKGENRGEKQISMRLFYKTNRIYLQICTHDPLTKLPTRDLHRLGDLLFETFITLILLVRFISNFCHSVHLVYLLSYK
###Gene_Info_Comments GLEAN3_27753 ###
Inspection of the tiling array suggests that glean may have missed the following exons: KNFTHIKKGYHQSPPTVCYQLPLFVCLVFPSLSPSVSVSLSHSVSTALYPPVSPSLSVSLSLSKVYLSHLLASCKSFSFRCLRIVSLSEDDLENRLE,SSQPATHMGLCCCCSLRRAASNAAALTLCLSLPAALLPEARKNRSEWLADGWSTEVESGACLVADVSWLAADGGAEYVDDVDGCEWDVPC
###Gene_Info_Comments GLEAN3_27912 ###
Inspection of the tiling array suggests that glean may have missed the following exons: RSFPLALELAVGQTSPEEIVALQHKWLESHLSPVGAASISARACTFLTVLTRNPPLGSGHCIMIFSCMSLLLRDQTPASSDLRNQPKLAELW,SGSFFRAFASRLSLTSSFSSGGSSSSSGGSSSSSHAHSTSSSAPSTSSNGPSTSSNGPSTSSDGPTSMSASGSPAVSASVAWLVGSSSYCVHESLV,GFRFPFEPDFIFFLRWLILFFRWLILFLPCPLYFLQCPFYFLQWPIYFLQWPIYFLRWPHFYVCFRVSSCVCFSSLAGWILLILCP
###Gene_Info_Comments GLEAN3_27919 ###
Inspection of the tiling array suggests that glean may have missed the following exons: NVSDAQKFQCGMCEDLFTSAKLILRHMRCEHSQDRTEDLYPVLMWKRKKKKKALETENSSQEDLTIENKVSEQGDLVFKCRTCEKVFSCHGRLKEHESFHKFSQGHACPVCDKKETNSRTLAKHMKTHEPLVLKCKECNRIYKTKSALRKHLNEFHGHQCRICSERFPYMTDCKKHEQTHQGSTKNGAHTGKSLLSASSPEEPKDMLGKTSNYYQRPFKCRYCPKRYSLRSSVKTHEKERHTGDLVFKCPHCPKVFGREYRLIDHLRSHEENRMYRCKLCPKTFGSESALTNHQGEHTGLKPFSCDICSKGFRIKKAVQDHKRRIHQKRQMRFFCSVCNKGFADKGNFTKHERRHKGVRPYVCLECGKGFTAKSCLTTHIKAMHTAEKPFSCELCGKTFSLNQNYTYHMFRHKEQGDISSIQQ,SYLKDSGPVHSSANHLHHAHLQRPVAFGGHGVKFSSLPASYTPPITAQEPAVERNDVPVSLTTCVPVVERTDLPTTTRESVIDIPPISMRESPTDSCIPPPTKQELAEDSYMSPSTAHESPIDSSISPTTTQESVEDSSIPATTTNEQVTDSINIEPVTTSEVAMDEDIPPAKSWGTVEERSIRLVTTHKPQAEGNT,LYFFCHFFLSPGQTPIAPPAPSKPLHSKEQRKQDTSSSSQSNTSPPASAVEPSLLAALNLKKSRMLSSTPELPQGLSFRWTLEGKVEGVVAKGTVEKGSEFGPYPGSLMNEEQGLSKD,ALTSNLVEFFPLYHMTTCWLDSNIVFEGIAYDRVVLEAVLCNLEIYYGRAIKFCVICHNQQTEVALLNNIIDPGFGEPTHLAIG,SSFYEQWKVITPLVTTSLKGINKKTDNSNNRHHSSNHNHNSNHSSSNMLHTIPLICQYHTTHPLSISLSIWGTINRPEFSQYHTQDLWHKCSVQFIMVILINLILIF,AMEGNYPPGYYLPERDQQENRQQQQPPPQQQPQPQQQPQQQQHVAHYPFDLPIPYHTSSQHQSEYMGDDQPTRVFPVPHPGPMAQMFSPVYHGNINKSDFDI
###Gene_Info_Comments GLEAN3_28222 ###
Inspection of the tiling array suggests that glean may have missed the following exons: ADNCSGLQGFLIFHSFGGGTGSGLNALLMERLSVDFGKKSKLEFAIYPAPQVSTAVVEPYNSILTTHTTLEHSDCAFMVDNEAIYDICRRNLDIERPSYQNLNRLIGQIVSSITASLRFDGALNVDLTEFQTNLVPYPRIHFPLVTYAPVISSEKAFHEQLSVSEITTSCFEPLNQMVKCDPRHGKY,TLSSWSVSPSTSARSPNWSSPSIRHLRFPPLLSSHTTPSLPLTPPSSTPTVPSWSTTKPSTISAVVISTSSVRHTRTSTV,LAKVQRAVCMLSNTTAIAEAWGRLNHKFDLMYAKRAFVHWYVGEGMEEGEFAEAREDLAALEKDYEEVGIDSCDAEAEDDEDY
###Gene_Info_Comments GLEAN3_28746 ###
Inspection of the tiling array suggests that glean may have missed the following exons: FRSFTEIDFWQNEEDGRCFDGETRESGDDDHGGNNKSCECKVTEPSTGSVCSYCQKKHERTDSGGKPSLQCFFCDCSFSIECHLTRHLQFHVGMKTYDTFHCSLCKKSFLSKSDLVKHKTKCTGEKPYECIHCTSTFAKQTDLKVHIRTHNQVKNILTVQTQDQTEHSYGQSQSQCPYCKRAFKTKSTLDSHIGTMTFENSYSCSHCSSTFRSKCSLTLHNRTHKYQCFLCNKRFASLDGRNTHVKWHTGVKPHHECSYCSKKFSKKCHLDEHVRIHTGEKPYRCSYCEKGFRTKGNFTKHLKIHNGGNNEEG,KQQGEREQRLTPVKEVGLCLACYMKEESSMEFYIKEEKLLFYEAETGKSDRDQESLQEDVKQSCVDEKGWIDMFAEPEIASSALSSDPQASETVLIAVGQESVLEDDERIGDTSQESSERESVPPTQ
###Gene_Info_Comments GLEAN3_28753 ###
Inspection of the tiling array suggests that glean may have missed the following exons: RISCYNTNIHDEALLLYAISHVYFSHPMMQTSCYIENIDIYVSCCEELDYTWIEMYCYNENMNKNFQRSLFLCVSQGWTLF
###Gene_Info_Comments GLEAN3_00017 ###
Inspection of the tiling array suggests that glean may have missed the following exons: KLKPLETRMMDRLEERHQKERPWETRMIDRPEERHQKERPWETRMDRPEERHQKEKPWKTKMMERPWETRMMERPWETRVMDRPEERHQKEKPWETRMIDRPEERHQKEKPWETRMMDRPEERHQKEKP,QTLAEVKVETIGDKNDGQAGGKTSKGETMGDENDRQTGGKTSKGETMGDEDGQAGGKTSKGETMEDENDGETMGDENDGETMGDESDGQTRGKTSKGETMGDENDRQTGGKTSKGETMGDENDGQTGGKTSKGETIGDENDSSIQFISHHF
###Gene_Info_Comments GLEAN3_00437 ###
Inspection of the tiling array suggests that glean may have missed the following exons: VVALLPADTDLLVVELEVVGAQDLAVAAEYSDSDFLDVVSLDVVVLEAVGLFVVAAGLTDVADVVLFEVAVGLHDVVVFALHLVDDVMSEDQQEVVPGVALLLVAVILDYWAVPGSHLVDILVIGLHFH,HIIIKYRSADVVCTRRLGYWFQTTELAHEVHCNITQAMGRRRAKSIFGTGVISTPLFSTKLFGTGNVDNLANMKSHFGTLYYRPM,FEAEQFCMFIYNNITIYMFLHSVSVDTIVQCRYYPCLCHFTVPTSQIIHLQIAMLCIKITRSPRTPKWYPNQPRYISHNTPSPCFALETKIHTSNIELPETCI
###Gene_Info_Comments GLEAN3_00440 ###
Inspection of the tiling array suggests that glean may have missed the following exons: PLQEFTLFFAPNCVLFLHESVMLSVQSREFGKKKFCCVTSSLFLSLRQTGSVHSTVGVHFNYDDAGTSRMHESLSSLSLPFQYQ,SRCLRIRPRRLLRRLTKRLPLRLSRRLLLRLSQRLLKRLPRRLTRRRPRRFLRRPLRSPPRRLSKSESSLRTPRVESMSS,IKMPQNKTKKAAEKANEKAPIKAVKKASVTAVAKTSEKAAKKAHEKAPKKVPQETSEKPAEKVVKKRVKSEDPKSGVHVFLGAKDFPRWCVMKEKSGCKTNVQLMRWLLGIAEKYFG
###Gene_Info_Comments GLEAN3_00540 ###
Inspection of the tiling array suggests that glean may have missed the following exons: IQININKGENHITRRERILFLHSDNYFDLPVSNSLAFKQLTMAHTLIVSYPTHLAPTGLFHHLTSLKNRHTHDLPNDYFC
###Gene_Info_Comments GLEAN3_00578 ###
Inspection of the tiling array suggests that glean may have missed the following exons: THTREKLYECSHCQKSFSHKGNLTQHLLTHTGEKPYECCSCKKGFSQKSTLNCHILTNGRKAIRVFTTKTHIAEKPYVHIVVNGFLKKVLLNIYVRKVFLTNAISHPTPTNTHRKKAL
###Gene_Info_Comments GLEAN3_00603 ###
Inspection of the tiling array suggests that glean may have missed the following exons: HKIFIYFWSGCVNGGGQIKAEEGITAKELLVEVEELYNSAPTRTPEENEVVQTLYEEWLGGVGSEKARTMLHTQYHALEKNTNALNIKW,TEEGRSRLKKASQRRNYSWKLKNCTTLLLRGHPRRTRWSRLFTRSGSEEWAVRKQGLCCIRSTMRWRRTQMLLTSSGRTGIGICRLGN,QPNINYLFLTQQRGXXXXXXXXXXXXXXXXXXAQSLRVKPDCIYHVTVMPCYDKKLEASRDDFYDDVYRTRDVDCVITSGMYLLFCLALF
###Gene_Info_Comments GLEAN3_01483 ###
Inspection of the tiling array suggests that glean may have missed the following exons: NPKQNIFVFFLNPEKPWQGEDKPSPSSQPKKSSSSRMKRKIKEENTPNDEVLDHVTRVTGLETVDGRIVDEEGEPEPESKKKMKKAKRKHI
###Gene_Info_Comments GLEAN3_02586 ###
Inspection of the tiling array suggests that glean may have missed the following exons: SLSCVYYPKKYIFIKFKHLPGLPYINYAVLAWGKSLITQLDKLFLAQKRVIRTICNADFCAHTNPLFYHHRILKVEDIYYMQLGSLMYDLNSGVLALAKIFKKNNQIHNYTTRSASAFHLPHARTKFTLNSLVCNRPRFWSTLVLTPVSICLQT,QGQTITKAFFILPSPYFIPGYFLCFSFLFPIPLLYLQMFYSLFLLHLSVPPPPPTNLSSSIILSIYLLFLIFSPLSHSLSISPLHLFPFCVKGSIAFFNFL,INNIFILPIYIHVHLPWIHLCTAYVHHLGYQGVFLPCNHSKGKLSQKLSSSSQVLISFLGIFFVFLFFSPFPFCIFKCSIPFSFFISLYPPPPPPISLPQSYFPYISFS,IISLFYQFIYMCIYLGFIYAQRMCTISATRECSYHVIIARANYHKSFLHPPKSLFHSWVFSLFFFSFPHSPFVSSNVLFPFPSSSLCTPPPPHQSLFLNHTFHISPFLNLFPPLSLAIHFPPPPIPLLCQRFNCFLQFS,LYSVTCTCTCMLGKLENPFCQNSPFPESLKKECTTNILFTLIILIILLDCTFPLNFCGALKIMMRSIIIKYLASSVPGRRKAN
###Gene_Info_Comments GLEAN3_02674 ###
Inspection of the tiling array suggests that glean may have missed the following exons: SWNSSDLMLRVSASWRVCSQLLVPFGCELHPTLMILFLSGSETCVWYSGILYLLLSISNKMYNGGIRWRISILDSMVNLNGASTCGC
###Gene_Info_Comments GLEAN3_02708 ###
Inspection of the tiling array suggests that glean may have missed the following exons: LNEWLTSKSTCTCALTTSFLINLLVLKLLPIGLACISSTNELLILPSGHSILKPWHGHINIPRTRVSCAWPKCFFNQGKGPLQWNNVV,LSSRFLHMKKRVCVDQATQTMPYVYSCNHCGSSPLANKPMPPLARIQPSGPQKIKLMDRVSNPLFSGPKTPNIKVIMIKKSTAGVLKRDSRMTPQALNNLMQLGSNLQAAQPLQGNLTGSS
###Gene_Info_Comments GLEAN3_03207 ###
Inspection of the tiling array suggests that glean may have missed the following exons: ERFHRNQALKKHMMRHAGNEPYPCSECNVRCLSKPGLVRHMATHSGTKDHQCCKCGKMFARPHDLRKHEQSHEEEPETYLCFICGQTFDHKKNYHAHIGTHTRRQHGGRPTCKAKDSSETSHLLNTSDGRIS,DVYPSLVLCGIWLHIVVQKTTSVANVGKCLLDHMISASMNNHMRKNLRLICVLYVARRSIIRKITMLILVRIREDSMVDVQHAKQKTLPKPLIC
###Gene_Info_Comments GLEAN3_03490 ###
Inspection of the tiling array suggests that glean may have missed the following exons: IGWFQRIRSVAAKMLILLPQPTMDLTACTNAKNTILKVRFSYSLLEFWYILLQPSPSALSVTFLSPKEKQLVLPAYTYLYVSC,RVALRDYVFSHGPTRGPIEYNTNTDDATAAITCVQHTQKPKCEENFGLFSGTRADHVEIKLLTNSNQVHDVNQHLPKLHPL
###Gene_Info_Comments GLEAN3_03649 ###
Inspection of the tiling array suggests that glean may have missed the following exons: IYSVSRKDGYRPYYIFPLMKYHISNEPPVIIYNFIKKKIHLLVRKVTYNSSRHHSVTSSPITELCKADRVKTSAVIAPGVNTSEEIARGKTGVLADRCVCVCR
###Gene_Info_Comments GLEAN3_03848 ###
Inspection of the tiling array suggests that glean may have missed the following exons: EGESCKKAKRKRKPTKPIKFMEYCSEAERIVVGNLPSPKATASYSTDQVMLMETDSSALNNEQFSQSSTQHMYCLLNFKIYECLQCGLGFASEKAMNVHIRTHSKEKPYWCTECDIGFTEHQLYMAHKQSHRPCKCDECGASFGNGSTLKNHKLLHLQSKNFKCSVCPKMFKQRAGLTCHMRSHTDERPYLCKECGAAFVDNKSLQNHMSVHSDEKAFKCSVCPKMFKQRAGLAHHMKHHNDDKQYLCKECGAAFAYNIHLQNHKAIHSDEKTFKCTICPKMFKQRAGLTGHMKAHTDEKPFMCELCGKSVKTKSTLKKHRMIHSEEKPYQCPLCPQAFKQRAGLSQHSHKHGEGNPY
###Gene_Info_Comments GLEAN3_04802 ###
Inspection of the tiling array suggests that glean may have missed the following exons: KLMNTFLLLFPPQSFYKNQTPLRARQTIKMKLKKQQKNPITTIVIMMMMTALQEKEVGRKAPWLEGVKAPNLRRNRQRNHLNHHRQRNTNATSVTLS,YLEINEHFSFTFSSPVILQKPDTSQGKTDNKDETQEAAEESNHNNCNNDDDDSPAGEGGREEGTLAGGGQGPEPEAEQAEEPFEPPPPEEHECNICNSVLTSLWELD
###Gene_Info_Comments GLEAN3_04806 ###
Inspection of the tiling array suggests that glean may have missed the following exons: EGYSIFHTFLVVHLALSFPTRDLHLRPLVLLLLLLLELRGRGTYCALFGLCDLFLGMYLHVSVKMETFADLGADGTGLELGRGRFTHVGVHVGIDLHLI,GIQHLPYLSRSPSRPELPYSRSSSEASCAAAAAAAGAAGKRNLLRPLWPLRSLPWNVSSCVGKDGDLCRPWCRWDRSGAGARALHPCWRSRRN
###Gene_Info_Comments GLEAN3_04807 ###
Inspection of the tiling array suggests that glean may have missed the following exons: NPVWMFVCLFMSFWELNSLKQCGHGCFRFGSVRSGCSCAANVPPLMSWWRRSFFWARDLQMSQMYENSVRSGEVSSWFKTSASFPQPPHLTLTVQGSRK
###Gene_Info_Comments GLEAN3_06058 ###
Inspection of the tiling array suggests that glean may have missed the following exons: FTKYQLSIHMKNHPEVKPFQCSACDKRFSLKSYLAQHMKYHSDKKTHQCPMCPKGFIRNSVLQEHIKTHASEKPFECAMCGKRFSSKISLAVHMKKVCKKRPDREQQDNPLPSV
###Gene_Info_Comments GLEAN3_09398 ###
Sp-MAP2K5 spans two GLEAN predictions:
GLEAN3_09399 and GLEAN3_09398
###Gene_Info_Comments GLEAN3_00751 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and domain structure.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix D.1 is arbitrary and simply reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.
###Gene_Info_Comments GLEAN3_01794 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and domain structure.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix C.1 is arbitrary and simply reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.

It is noteworthy that GLEAN3_01797 (Sp-MACPF-C.2), a model adjacent to this gene, also contains the MACPF domain. A comparison of their protein sequences reveals high similarity but a fair number of differences as well. It remains to be determined whether this reflects the erroneous assembly of different haplotypes (both genes are indeed located in an area of numerous contigs) or a true gene duplication event.
###Gene_Info_Comments GLEAN3_05223 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and domain structure.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix A.1 is arbitrary and simply reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.

NB: Given the position of this model in a small scaffold and its similarity to typically larger models, it is likely that this is an incomplete model. Also, other gene prediction protocols generated slightly different models for this gene. For lack of better evidence, we have decided to accept this glean model in its present form. In addition, GLEAN3_22091 shows high sequence similarity to this model, but some differences as well (including sequence gaps); it is yet to be determined if these two models might represent haplotypes wrongly assembled as two different genes.
###Gene_Info_Comments GLEAN3_06818 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and domain structure.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix G.1 is arbitrary and simply reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.

NB: This model is located at the end of a scaffold. We cannot rule out that this model was forcibly truncated during the gene prediction process. In fact, its only functional domain (MACPF) lies next to the C-terminal end of the predicted protein, which is uncharacteristic of the other members of this family of genes.
###Gene_Info_Comments GLEAN3_14677 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and domain structure.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix C.3 is arbitrary and simply reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.

NB: Other gene prediction protocols incorporate additional C-terminus sequence in their models for this gene, but without adding new identifiable domains to the predicted protein. For lack of better evidence, we have therefore decided to accept this model in its current form.
###Gene_Info_Comments GLEAN3_16546 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and domain structure.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix B.3 is arbitrary and simply reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.

The structure of this gene is highly supported by the embryonic genome-wide tiling array hybridization data, and by identical models generated by almost all gene prediction protocols.
###Gene_Info_Comments GLEAN3_17952 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and domain structure.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix A.4 is arbitrary and simply reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.
###Gene_Info_Comments GLEAN3_22091 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and domain structure.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix A.2 is arbitrary and simply reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.

NB: GLEAN3_05223 shows high sequence similarity to this model, but some differences as well (including sequence gaps); it is yet to be determined if these two models might represent haplotypes wrongly assembled as two different genes.
###Gene_Info_Comments GLEAN3_10248 ###
Incomplete Protein kinase domain
###Gene_Info_Comments GLEAN3_07159 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and domain structure.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix E.1 is arbitrary and simply reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.

NB: This model is located at the end of a scaffold. We cannot rule out that this model was forcibly truncated during the gene prediction process. In fact, its only functional domain (MACPF) lies next to the C-terminal end of the predicted protein, which is uncharacteristic of the other members of this family of genes.
###Gene_Info_Comments Sp-Gg1d ###
Gene model derived from EST matches. The predicted protein sequence exactly matches GLEAN3_14498 (Sp-Gg1), but an additional intron seems to be present in the 3'UTR.
###Gene_Info_Comments Sp-Gg3 ###
GLEAN inferred from EST data
###Gene_Info_Comments Sp-Gg4 ###
Gene predicted based on homology to human sequences, but this locus might be a pseudogene (because it doesn't have introns). It does appear to be expressed, though, based on tiling array data.
###Gene_Info_Comments GLEAN3_00986 ###
This gene model is located at the end of a very short contig. The nucleotide sequence of the first exon has 94% similarity to that of another Sp-Tlr gene. It could be a member of the Toll-like receptor family. The second exon, which was separated from the first by 100000 bp, was eliminated.

###Gene_Info_Comments GLEAN3_13279 ###
Gene fragment- likely haplotype of Glean3_18156- see annotation of that gene.
###Gene_Info_Comments GLEAN3_18156 ###
Complex gene with varying exon predictions among different models. Tiling data inconclusive.
###Gene_Info_Comments GLEAN3_07980 ###
Haplotype of Glean3_25345- see that gene for annotation.
###Gene_Info_Comments GLEAN3_05834 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.

This model Blasts best to Cathepsin C, and it strongly co-clusters with it in a NJ phylogenetic tree. We have therefore decided to name it "Cathepsin1" but note this high similarity by making "CathepsinC" one of the synonyms for this model.
###Gene_Info_Comments GLEAN3_09042 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.

This model Blasts best to Cathepsin L, and it strongly co-clusters with Cathepsin L and several other glean models in a NJ multiple protein alignment tree. We have therefore decided to name this model "Cathepsin 2" for consistency, while noting its high similarity to Cathepsin L by making "CathepsinL-like1" one of its synonyms.

NB: The structure of this model is highly supported by the fact that other gene prediction protocols generated almost identical models and by the genome-wide tiling array hybridization data.
###Gene_Info_Comments GLEAN3_09368 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.

This model Blasts best to Cathepsin L2; however, a NJ multiple protein alignment tree shows that its sequence is equally related to that of Cathepsin L, L2, K and S. For this reason we decided to name this model only "Cathepsin 3" and not group it with other Sp-CathepsinL-like models.

NB: The structure of this model is highly supported by the fact that other gene prediction protocols generated almost identical models.
###Gene_Info_Comments GLEAN3_09601 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.

This model Blasts similarly to Cathepsins X, Y and Z, and it strongly co-clusters with them and GLEAN3_13893 in a NJ multiple protein alignment tree. We have therefore decided to name this model "Cathepsin 4" for consistency, while noting its high similarity to Cathepsin Z by making "CathepsinZ-like1" one of its synonyms.

NB: Other gene prediction protocols generated noticeably different models for this gene. However, the genome-wide tiling array hybridization data indicate that all exons of this glean model would be expressed at similar levels during embryonic development, which supports the idea that they all belong to the same gene. For lack of better evidence we have decided to accept this glean model in its current form.
###Gene_Info_Comments GLEAN3_13893 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.

This model Blasts similarly to Cathepsins X, Y and Z, and it strongly co-clusters with them and GLEAN3_09601 in a NJ multiple protein alignment tree. We have therefore decided to name this model "Cathepsin 5" for consistency, while noting its high similarity to Cathepsin Z by making "CathepsinZ-like2" one of its synonyms.

NB: The structure of this model is highly supported by the fact that other gene prediction protocols generated almost identical models and by the genome-wide tiling array hybridization data. It must be noted, however, that the N-terminus for this model is located very close to one end of the scaffold. For this reason, we cannot rule out that more N-ter sequence not included in this model exists.
###Gene_Info_Comments GLEAN3_14767 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.

This model Blasts best to Cathepsin L, and it strongly co-clusters with Cathepsin L and several other glean models in a NJ multiple protein alignment tree. We have therefore decided to name this model "Cathepsin 7" for consistency, while noting its high similarity to Cathepsin L by making "CathepsinL-like3" one of its synonyms.

The structure of this model is highly supported by the fact that other gene prediction protocols generated almost identical models. It must also be noted that both adjacent models (GLEAN3_14766/GLEAN3_14768) show significant similarity to this model, which raises the question of whether these may reflect gene duplications. A careful inspection of their sequences and domain structure suggests that GLEAN3_14766 is a true cathepsin gene (annotated as such by Esther Miranda, Duke University), whereas GLEAN3_14768 may have been generated as a result of an assembly error (see GLEAN3_14768 for more details). Both models show significant differences at the amino acid level with GLEAN3_14767, which would argue that they are due to true gene duplication events, something that is also observed among vertebrate cathepsin L and L-like genes.
###Gene_Info_Comments GLEAN3_14768 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments. Based on such analysis, we propose that this is very likely an incomplete/wrong model. A careful analysis of its sequence and domain structure indicates that its true last coding exon (containing a STOP codon) may fall in a sequence gap adjacent to its second-to-last coding exon (i.e. the last coding exon may have been forcibly incorporated into this model). For lack of supporting evidence for this claim, we have accepted this glean model in its present form.

This model Blasts best to Cathepsin L, and it strongly co-clusters with Cathepsin L and several other glean models in a NJ multiple protein alignment tree. We have therefore decided to name this model "Cathepsin 8" for consistency, while noting its high similarity to Cathepsin L by making "CathepsinL-like4" one of its synonyms.

The (wrong?) structure of this model is highly supported by the fact that other gene prediction protocols generated almost identical models. It must also be noted that its adjacent model (GLEAN3_14767) shows significant similarity, which raises the question of whether this may reflect a gene duplication. A careful inspection of their sequences shows some noticeable differences at the amino acid level, which would argue that they are due to a true gene duplication event, something that is also observed among vertebrate cathepsin L and L-like genes.
###Gene_Info_Comments GLEAN3_14765 ###
This model was annotated and modified based on a manual inspection of multiple protein sequence alignments. We have modified this model based on the models generated by NCBI and FgeneshAB, both of which show a good alignment to cathepsin L-like genes. The original glean3 prediction presented exon and domain structures that clearly resembled artificially fused genes. In fact, the remainder of the original glean3 prediction, which is also represented by the respective NCBI and FgeneshAB predictions, closely resembles genes present in other phyla, which supports the claim that the original version of GLEAN3_14765 wrongly fused two separate genes.

This modified model Blasts best to Cathepsin L, and it strongly co-clusters with Cathepsin L and several other glean models in a NJ multiple protein alignment tree. We have therefore decided to name this model "Cathepsin 9" for consistency, while noting its high similarity to Cathepsin L by making "CathepsinL-like5" one of its synonyms.

NB: The adjacent model (GLEAN3_14766) shows significant similarity to this modified GLEAN3_14765, which raises the question of whether this may reflect a gene duplication. A careful inspection of their sequences shows some noticeable differences at the amino acid level, which would argue that they are due to a true gene duplication event, something that is also observed among vertebrate cathepsin L and L-like genes. However, we cannot rule out, based on this analysis, that they are different alleles of the same gene and that the apparent duplication is due to an assembly error.
###Gene_Info_Comments GLEAN3_28748 ###
Amino terminal domain truncated- no obvious 5' exons. Tiling data not consistent with gene models.
###Gene_Info_Comments GLEAN3_14914 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.

This model Blasts best to Cathepsin F, and it strongly co-clusters with Cathepsin F in a NJ multiple protein alignment tree. We have therefore decided to name this model "Cathepsin 10" for consistency, while noting its high similarity to Cathepsin F by including "CathepsinF-like1" as one of its synonyms.

NB: The structure of this model is highly supported by the fact that other gene prediction protocols generated very similar models and by the genome-wide tiling array hybridization data.
###Gene_Info_Comments GLEAN3_15668 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments.

This model Blasts best to Cathepsin L, and it strongly co-clusters with Cathepsin L and several other glean models in a NJ multiple protein alignment tree. We have therefore decided to name this model "Cathepsin 11" for consistency, while noting its high similarity to Cathepsin L by making "CathepsinL-like6" one of its synonyms.

NB: The structure of this model is supported by the fact that other gene prediction protocols generated almost identical models and by the genome-wide tiling array hybridization data.
###Gene_Info_Comments GLEAN3_20838 ###
This model was annotated and modified based on a manual inspection of multiple protein sequence alignments. We have modified this model based on the corresponding model generated by NCBI, which shows a better alignment to cathepsin L-like genes (although it generates an incomplete cds - i.e. no stop codon). The remainder (C-ter) of the original glean3 prediction does not Blast back to any known sequence in nr, nor does it contain any identifiable functional domain, which suggests it may represent an artificial fragment fused to the model for lack of an earlier stop codon in the scaffold.

The modified model Blasts best to Cathepsin L, and it strongly co-clusters with Cathepsin L and several other glean models in a NJ multiple protein alignment tree. We have therefore decided to name this model "Cathepsin 13" for consistency, while noting its high similarity to Cathepsin L by making "CathepsinL-like8" one of its synonyms.

NB: The adjacent model (GLEAN3_20837) shows significant similarity to this modified GLEAN3_20838, which raises the question of whether this may reflect a gene duplication. A careful inspection of their sequences shows some noticeable differences at the amino acid level, which would argue that they are due to a true gene duplication event, something that is also observed among vertebrate cathepsin L and L-like genes. However, we cannot rule out, based on this analysis, that they are different alleles of the same gene and that the apparent duplication is due to an assembly error.
###Gene_Info_Comments GLEAN3_20837 ###
This model was annotated and modified based on a manual inspection of multiple protein sequence alignments. Its structure is supported by almost identical models generated by all other gene prediction protocols.

This model Blasts best to Cathepsin L, and it strongly co-clusters with Cathepsin L and several other glean models in a NJ multiple protein alignment tree. We have therefore decided to name this model "Cathepsin 12" for consistency, while noting its high similarity to Cathepsin L by making "CathepsinL-like8" one of its synonyms. It should be noted, however, that this model lies close to one end of the scaffold, and we cannot therefore rule out that additional N-ter sequence has been left out in the assembly process.

NB: The adjacent model (GLEAN3_20838) shows significant similarity to this model, which raises the question of whether this may reflect a gene duplication. A careful inspection of their sequences shows some noticeable differences at the amino acid level, which would argue that they are due to a true gene duplication event, something that is also observed among vertebrate cathepsin L and L-like genes. However, we cannot rule out, based on this analysis, that they are different alleles of the same gene and that the apparent duplication is due to an assembly error.
###Gene_Info_Comments GLEAN3_17141 ###
This gene spans three GLEAN predictions. GLEAN3_17141 contains the first ~890 aa, and GLEAN3_12930 and GLEAN3_21952 have the rest. GLEAN3_12930 is the largest piece with ~1400 aa.
###Gene_Info_Comments GLEAN3_12930 ###
This gene spans three GLEAN predictions. GLEAN3_17141 contains the first ~890 aa, and GLEAN3_12930 and GLEAN3_21952 have the rest. GLEAN3_12930 is the largest piece with ~1400 aa.
###Gene_Info_Comments GLEAN3_21952 ###
This gene spans three GLEAN predictions. GLEAN3_17141 contains the first ~890 aa, and GLEAN3_12930 and GLEAN3_21952 have the rest. GLEAN3_12930 is the largest piece with ~1400 aa.
###Gene_Info_Comments GLEAN3_02324 ###
embryonic lethal, abnormal vision, drosophila, homolog-like 1; Hu antigen R
###Gene_Info_Comments GLEAN3_10853 ###
Putative pre-mRNA splicing factor ATP-dependent RNA helicase DHX15 (DEAH box protein 15)
###Gene_Info_Comments GLEAN3_08911 ###
This model was annotated based on a manual analysis of multiple protein sequence alignments and domain composition. Its structure is supported by almost identical models generated by other gene prediction protocols.

This gene aligns well with and shows a domain structure that resembles genes of the heme-dependent peroxidase superfamily, which includes a large number of genes from various phyla. These genes are most typically identified based on their biological function (e.g. ovoperoxidase, lactoperoxidase, myeloperoxidase, eosinophil peroxidase, etc), but they all present a secretory signal peptide and a single haem-peroxidase domain.

NB: The N-ter of this model is likely missing, based on the structure of the genes with which this gene aligns best and the fact that this model is located at the end of a scaffold.
###Gene_Info_Comments GLEAN3_19097 ###
This model was annotated and modified based on a manual analysis of multiple protein sequence alignments and domain composition.

This modified model aligns well with and shows a domain structure that resembles genes of the heme-dependent peroxidase superfamily, which includes a large number of genes from various phyla. These genes are most typically identified based on their biological function (e.g. ovoperoxidase, lactoperoxidase, myeloperoxidase, eosinophil peroxidase, etc), but they all present a secretory signal peptide and a single haem-peroxidase domain.

NB: Different gene prediction protocols generated noticeably different models for this gene (mostly towards the N-ter of the model). The NCBI model shows the best alignment with other heme-dependent peroxidases to which this model Blasts back, and thus we chose it to modify this glean prediction.
###Gene_Info_Comments GLEAN3_02004 ###
This model was annotated based on a manual analysis of multiple protein sequence alignments and domain composition. Its structure is supported by almost identical models generated by other gene prediction protocols and by the embryonic genome-wide tiling array hybridization data.

This gene aligns well with and shows a domain structure that resembles genes of the heme-dependent peroxidase superfamily, which includes a large number of genes from various phyla. These genes are most typically identified based on their biological function (e.g. ovoperoxidase, lactoperoxidase, myeloperoxidase, eosinophil peroxidase, etc), but they all present a secretory signal peptide and a single haem-peroxidase domain.

NB: The N-ter of this model is missing, based on the structure of the genes with which this gene aligns best and the fact that this model is located at the end of a scaffold.
###Gene_Info_Comments GLEAN3_10343 ###
DEAD Box containing protein. Similar to either DEAD (Asp-Glu-Ala-Asp) box polypeptide 5 OR DEAD (Asp-Glu-Ala-Asp) box polypeptide 17.
###Gene_Info_Comments GLEAN3_24969 ###
The original glean is the C-terminus of the snx13 gene; the N-terminus is on the same scaffold, in the previous glean, GLEAN3_24968. I've combined the predictions here.
###Gene_Info_Comments GLEAN3_06014 ###
Added 3' exon from Angerer model due to alignment to mmp13.
###Gene_Info_Comments GLEAN3_24968 ###
N-terminal piece of Sp-RGS-PX1; full annotation is in GLEAN3_24969
###Gene_Info_Comments GLEAN3_18333 ###
First exon position is uncertain: some discrepancy with the EST. Neither EST nor homology searches give a good first exon prediction, so the exact region is presumably not in the assembly yet.
###Gene_Info_Comments GLEAN3_07675 ###
pre-mRNA splicing factor SF3a (60kD). GLEAN3_16711 is a duplicate prediction.
###Gene_Info_Comments GLEAN3_16711 ###
Duplicate prediction. GLEAN3_07675 is complete gene.
###Gene_Info_Comments GLEAN3_11993 ###
The human gene is significantly longer, but the prediction may be complete, as the C. elegans and Drosophila genes are about the same size.


This gene encodes subunit 2 of the splicing factor 3a protein complex. The splicing factor 3a heterotrimer includes subunits 1, 2 and 3 and is necessary for the in vitro conversion of 15S U2 snRNP into an active 17S particle that performs pre-mRNA splicing. Subunit 2 interacts with subunit 1 through its amino-terminus while the single zinc finger domain of subunit 2 plays a role in its binding to the 15S U2 snRNP. Subunit 2 may also function independently of its RNA splicing function as a microtubule-binding protein.
###Gene_Info_Comments GLEAN3_04897 ###
Likely missing 3' exon not present on contig.
###Gene_Info_Comments GLEAN3_02029 ###
Haplotype of Glean3_04897 (Sp-mmp24d).
###Gene_Info_Comments GLEAN3_13513 ###
Duplicate prediction for GLEAN3_26806
###Gene_Info_Comments GLEAN3_26806 ###
Distal part of the SF3a120. First part is predicted by GLEAN3_27895.
###Gene_Info_Comments GLEAN3_27895 ###
GLEAN3_27895 contains the first part of the gene. GLEAN3_26806 codes for the tail end. GLEAN3_13513 is a duplicate prediction for GLEAN3_26806.
###Gene_Info_Comments GLEAN3_00866 ###
Likely missing 5' exon based on blast alignments.
###Gene_Info_Comments GLEAN3_27966 ###
PHD finger-like domain protein 5A (Splicing factor 3B associated 14 kDa protein) (SF3b14b)
###Gene_Info_Comments GLEAN3_02161 ###
This gene encodes one of four subunits of the splicing factor 3B. The protein encoded by this gene cross-links to a region in the pre-mRNA immediately upstream of the branchpoint sequence in the prespliceosomal complex A. It also may be involved in the assembly of the B, C and E spliceosomal complexes. In addition to RNA-binding activity, this protein interacts directly and highly specifically with subunit 2 of the splicing factor 3B. This protein contains two N-terminal RNA-recognition motifs (RRMs), consistent with the observation that it binds directly to pre-mRNA.
###Gene_Info_Comments GLEAN3_00101 ###
Likely has an extra exon predicted.
###Gene_Info_Comments GLEAN3_17104 ###
The nucleotide sequence of the second exon has 93% identity to that of another typical Sp-Tlr gene, while that of the first exon does not. This gene model may be a member of the Toll-like receptor family.
 
###Gene_Info_Comments GLEAN3_03478 ###
Missing one or more exons at the beginning of the gene. GLEAN3_03477 may represent one of the missing exons or it may be a duplication of GLEAN3_03478.
###Gene_Info_Comments GLEAN3_03477 ###
Missing one or more exons at the beginning of the gene. GLEAN3_03477 may represent one of the missing exons or it may be a duplication of GLEAN3_03478.
###Gene_Info_Comments GLEAN3_13007 ###
GLEAN3_17802 is a duplicate prediction.
###Gene_Info_Comments GLEAN3_17802 ###
This is a duplicate prediction for GLEAN3_13007
###Gene_Info_Comments GLEAN3_21629 ###
similar to mitosis-specific chromosome segregation protein SMC1. GLEAN3_26628 appears to be a duplicate prediction over the last exon(s).
###Gene_Info_Comments GLEAN3_26628 ###
GLEAN3_26628 appears to be a duplicate prediction over the last exon(s).
###Gene_Info_Comments GLEAN3_12607 ###
small nuclear ribonucleoprotein D1 polypeptide (16kD); snRNP core protein D1; Sm-D autoantigen
###Gene_Info_Comments GLEAN3_08908 ###
Modified gene model to reflect cloned cDNA. Included sequences present on small scaffolds 85810, (167442 and 56237 both match same region), 160020.

###Gene_Info_Comments GLEAN3_24030 ###
GLEAN3_10849 is likely a duplicate prediction for GLEAN3_24030
###Gene_Info_Comments GLEAN3_10849 ###
GLEAN3_10849 is likely a duplicate prediction for GLEAN3_24030
###Gene_Info_Comments GLEAN3_19147 ###
GLEAN3_21883 is a duplicate prediction of GLEAN3_19147
###Gene_Info_Comments GLEAN3_21883 ###
GLEAN3_21883 is a duplicate prediction of GLEAN3_19147
###Gene_Info_Comments GLEAN3_12978 ###
Lacking C-terminus. See GLEAN3_28726, _27443, _08472.
###Gene_Info_Comments GLEAN3_05089 ###
This gene model has no TIR domain. However, the nucleotide sequence encoding the signal peptide, LRRNT, LRR(15-23) and CT has 88% similarity to that of another typical Sp-Tlr. The unknown sequence (NNN) at the 3' end could make the gene model incomplete.

###Gene_Info_Comments GLEAN3_01231 ###
This gene model has no TIR domain. However, the nucleotide sequence encoding LRR(12-21) and CT has 85% similarity to that of another typical Sp-Tlr. The unknown sequence (NNN) at the 3' end of this gene model could make it incomplete.

###Gene_Info_Comments GLEAN3_06611 ###
This gene model has no TIR domain. However, the nucleotide sequence encoding SP, NT, LRR(12-23) and CT has 86% similarity to that of another typical Sp-Tlr. The unknown sequence (NNN) in the 3'UTR could make the gene model incomplete.

###Gene_Info_Comments GLEAN3_08150 ###
This gene model has no TIR domain. However, the nucleotide sequence encoding SP, NT, LRR(12-23) and CT has 86% similarity to that of another typical Sp-Tlr. The unknown sequence (NNN) in the 3'UTR could make the gene model incomplete.

###Gene_Info_Comments GLEAN3_08227 ###
This gene model may represent a pseudogene or contain a sequence error. 1200 bp of 3'UTR was incorporated into the coding region that encodes the TIR domain.

###Gene_Info_Comments GLEAN3_09172 ###
This gene model has no TIR domain. However, the nucleotide sequence encoding SP, NT, LRR(9-21) and CT has 85% similarity to that of another typical Sp-Tlr. The unknown sequence (NNN) at the 3' end of this gene model could make it incomplete.

###Gene_Info_Comments GLEAN3_28478 ###
Sp predicted similar to CG5680-PB
###Gene_Info_Comments GLEAN3_00151 ###
duplicate accession: NP_999689
GLEAN3_10284 is a nearly identical internal duplicate
###Gene_Info_Comments GLEAN3_16748 ###
One of 2. Overlaps with GLEAN3_17385, and matches exactly in the overlap. This sequence is longer at the C-terminus, and appears to lack the start codon. Only the C-terminal 2/3 of the protein matches cdc2L1, and the string of Es appears not to be correct.
###Gene_Info_Comments GLEAN3_17385 ###
One of 2; GLEAN3_16748 overlaps with this sequence. They match exactly (DNA & protein). This gene is longer at the N-terminus and appears to contain the start codon, although internal sequences appear to be bad, and the long runs of Es aren't 'real'.
###Gene_Info_Comments GLEAN3_08111 ###
matches middle of human/mouse gene sequence
appears to be missing exons on beginning/end
###Gene_Info_Comments GLEAN3_24925 ###
Notes:
- 1 bp missing after 34146
- 1 bp insertion at 36263
- 2 bp insertion at 39462-3
- 2 bp missing after 39814
- 18 bp missing after 41009
- 15 bp missing after 41117
- mismatch between 41178-94
- 12 bp missing after 41291
- 6 bp insertion after 41348
- mismatch between 41465-73
- 6 bp missing after 41496
###Gene_Info_Comments GLEAN3_13575 ###
Prediction of last exon is likely to be incorrect.
###Gene_Info_Comments GLEAN3_14846 ###
1 of 2, the other is GLEAN3_18780. These 2 are overlapping and nearly identical where they overlap, although each has gaps with respect to the other.
###Gene_Info_Comments GLEAN3_18780 ###
1 of 2, other is GLEAN3_14846. These are overlapping genes that are nearly identical where they overlap, although each has gaps with respect to the other. This gene BLASTs with an e of zero to both CG7337-PA (XP_781545.1) and to MAPKBP1-like.
###Gene_Info_Comments GLEAN3_11815 ###
Homology is weak. The model is possibly not correct.
###Gene_Info_Comments GLEAN3_09620 ###
1 of 2, the other is GLEAN3_09206, which seems to be an internal fragment of this gene. They are identical in the overlapping region. Both genes hit XP_794873.1 as well.
###Gene_Info_Comments GLEAN3_09206 ###
1 of 2, other is GLEAN3_09620. This one is a shorter version of 09620, and is identical. Both genes BLAST to both XP_794873.1 and XP_783732.1
###Gene_Info_Comments GLEAN3_24798 ###
GLEAN3_24798 represents the first half of the gene. GLEAN3_09840 is the latter half.
###Gene_Info_Comments GLEAN3_09840 ###
GLEAN3_24798 represents the first half of the gene. GLEAN3_09840 is the latter half.
###Gene_Info_Comments GLEAN3_19790 ###
glean prediction looks like a duplicated part of larger rgs12-containing region. See full annotation with GLEAN3_04238.
###Gene_Info_Comments GLEAN3_02103 ###
Obtained Glean3_02103 from S. purpuratus genome by using N-terminal peptide sequence of NM_003972 (human gene BTAF1) with score of 72 bits and E-value 3e-13. Other Glean3 candidates had poor scores. 

The best genbank hits (XP_795066, E=3e-59; and XP_788365, E=2e-55) are predicted partial peptides similar to TBP-associated factor 172 (TAF-172) (TAF(II)170) of Strongylocentrotus purpuratus and are incomplete at the carboxy terminal. Predicted TAFs from other organisms also appear with high scores.

Blasting Genbank yields some sequence support from empirical data, yet raises some warning flags. Human BTAF1 RNA polymerase II, B-TFIID transcription factor-associated, 170kDa, has an E-value of 2e-24. However, sequence data for an endonuclease/reverse transcriptase (presenilin gene) from Branchiostoma floridae (amphioxus) is also returned, with an E-value of 1e-24, along with other reverse transcriptases from schistosomes, several mosquito species, and chickens.

Upon examining the Genboree presentation of Glean3_02103, one observes: 1) The Glean, NCBI, and FgeneshAB predictions appear to span two different contigs while the Genscan model does not, by virtue of omitting the segment at the 5' end (this gene is on the - strand); 2) The NCBI model lacks the largest exon predicted by all other gene models; and 3) No Exonerate or Splign data appear.

Microarray tiling data from Systemix seem to indicate weak signals supporting two of the seven exons (5175-5338; 3816-4478), no support for two exons (1323-1481; 317-496), and questionable support for the remainder.

Upon observing the questionable support for the gene in its entirety, I selected the sequence corresponding to the exon missing in the NCBI data and blasted the translated peptide sequence against Genbank. This time, my query obtained a hit corresponding to XP_788888, which is the predicted CDS for an endonuclease/exonuclease/phosphatase family and RNA-directed DNA polymerase (5R694) of Strongylocentrotus purpuratus. Moreover, when I blasted the N-terminal 240 aa against Glean3, the best hit was GLEAN3_03468 (484 bits, E=e-137) instead of the initial GLEAN3_02103, a stronger match than the original.


###Gene_Info_Comments GLEAN3_09223 ###
This gene model doesn't have a TIR domain, but the nucleotides encoding SP, NT, LRR(8-22), CT have 90% similarity to another typical Sp-Tlr. The unknown sequence (NNN) at the 3' end of the coding region could make the gene model incomplete.
###Gene_Info_Comments GLEAN3_01621 ###
internal duplication: looks like the ends of two contigs in the scaffold (aagj01193203 and aagj01193210) are actually overlapping... hence one of the exons was duplicated in prediction & deleted in annotation (actually, 2 exons are duplicated because of this, but only one extra made it into prediction)
###Gene_Info_Comments GLEAN3_09450 ###
This gene model may represent a pseudogene or contain a sequence error. Part of the intron sequence matches the coding sequence of other Sp-Tlr genes and contains stop codons. The modified model contains some frame shifts but reflects the best gene structure.
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_11156 ###
This gene model doesn't have a TIR domain, but the nucleotides encoding SP, NT, LRR(15-22) have 90% similarity to another typical Sp-Tlr. The unknown sequence (NNN) at the 3' end of the coding region could make the gene model incomplete.

###Gene_Info_Comments GLEAN3_16689 ###
Sp-seawi is made up of glean3_16689 on scaffold 19510 and glean3_24002 on scaffold 55433
###Gene_Info_Comments GLEAN3_25996 ###
This is only the N-terminus and should be combined with GLEAN3_25997.
###Gene_Info_Comments GLEAN3_11775 ###
Partial Toll-like receptor. The nucleotides encoding SP, NT, LRR(5-17) have 89% similarity to another typical Sp-Tlr gene. This gene model is located at the end of the scaffold.

###Gene_Info_Comments GLEAN3_12584 ###
This gene model doesn't have a TIR domain. The nucleotides of the first exon have 90% similarity to another typical Sp-Tlr gene, while the second exon could be a wrong prediction. The first exon is located at the end of the contig, which could make the gene model incomplete.

###Gene_Info_Comments GLEAN3_24002 ###
See Glean3_16689
###Gene_Info_Comments GLEAN3_01826 ###
The GLEAN3_01826 prediction may be incomplete. GLEAN3_09266 matches part of GLEAN3_01826 exactly.
###Gene_Info_Comments GLEAN3_18535 ###
GLEAN3_12748 appears to be a duplicate prediction for GLEAN3_18535.
###Gene_Info_Comments GLEAN3_02257 ###
GLEAN3_18588 is a duplicate prediction for GLEAN3_02257.
###Gene_Info_Comments GLEAN3_18588 ###
GLEAN3_18588 is a duplicate prediction for GLEAN3_02257.
###Gene_Info_Comments GLEAN3_02983 ###
Partial sequence.
Naked cuticle-2 is an EF hand calcium-binding domain protein similar to the recoverin family of myristoyl switch proteins.
###Gene_Info_Comments GLEAN3_25144 ###
GLEAN3_25144 contains the first part of the gene. GLEAN3_23888 contains the second half. The two predictions overlap significantly.
###Gene_Info_Comments GLEAN3_23888 ###
GLEAN3_25144 contains the first part of the gene. GLEAN3_23888 contains the second half. The two predictions overlap significantly.
###Gene_Info_Comments GLEAN3_16974 ###
This gene model doesn't have a TIR domain, but the nucleotides encoding SP, NT, LRR(7) have 98% similarity to another Sp-Tlr gene (GLEAN3_13751).  This gene model is located in a short scaffold, which could make it incomplete.

###Gene_Info_Comments GLEAN3_14928 ###
GLEAN3_14928 was divided into two Sp-Tlr genes (Sp-TlrP10 and Sp-TlrP11). This gene model doesn't have a TIR domain, but the nucleotides encoding LRR have high similarity to other typical Sp-Tlr genes. The model is located at the end of a contig.
###Gene_Info_Comments Sp-TlrP11 ###
GLEAN3_14928 was divided into two Sp-Tlr genes (Sp-TlrP10 and Sp-TlrP11). This gene model has unknown sequence in the 3' region, which could make the model incomplete. 

###Gene_Info_Comments GLEAN3_25813 ###
Incomplete prediction.
###Gene_Info_Comments GLEAN3_09091 ###
Scaffold_79280 missing 5' (leader) and 3' end (remainder of serine protease domain), probably because of incomplete sequence data.
###Gene_Info_Comments GLEAN3_28187 ###
Scaffold_80160 missing 5' start (leader sequence), one exon of vWF domain and 3' end (remainder of serine protease domain), probably because of incomplete sequence data.
###Gene_Info_Comments GLEAN3_11546 ###
Prediction possibly too long.
###Gene_Info_Comments GLEAN3_14932 ###
GLEAN3_14932 was divided into two Sp-Tlr genes (Sp-TlrP12 and Sp-TlrP13). This gene model doesn't have a TIR domain, but the nucleotides encoding SP, NT, LRR (13-23) have high similarity to other typical Sp-Tlr genes. The model is located at the end of a contig.

###Gene_Info_Comments GLEAN3_28188 ###
Scaffold_80160 is incomplete.  Appears to be complement factor B, but missing vWF domain and most of serine protease domain, probably because of incomplete sequence data.
###Gene_Info_Comments Sp-TlrP13 ###
GLEAN3_14932 was divided into two Sp-Tlr genes (Sp-TlrP12 and Sp-TlrP13). This gene model doesn't have a TIR domain, but the nucleotides encoding SP, NT, LRR (16-23), CT and TM have high similarity to other typical Sp-Tlr genes. The model is located at the end of a contig.

###Gene_Info_Comments GLEAN3_15299 ###
This gene model doesn't have a TIR domain, but the nucleotides encoding SP, LRR(7-13) have 89% similarity to another Sp-Tlr gene. This gene model is located at the end of a scaffold, which could make it incomplete.

###Gene_Info_Comments GLEAN3_15511 ###
This gene model may represent a pseudogene or contain a sequence error. 450bp of the 3'UTR was accepted into a coding region that encodes a TIR domain.

###Gene_Info_Comments GLEAN3_25703 ###
Incomplete gene model: expected 5' part of the gene is missing
###Gene_Info_Comments GLEAN3_15534 ###
This gene model doesn't have a TIR domain, but the nucleotides of the first exon have 91% similarity to another Sp-Tlr gene (GLEAN3_15533).  The first exon is located at the end of a contig, which could make it incomplete.

###Gene_Info_Comments GLEAN3_15789 ###
The first and second exons and the intron between them were accepted into this modified gene model. Their nucleotides have 88% identity to another Sp-Tlr gene (GLEAN3_15066). The 3' end of the model is located at the end of a contig.

###Gene_Info_Comments GLEAN3_18099 ###
The gene model doesn't have a TIR domain, but the nucleotides encoding SP, NT, LRR(6-13) have 89% identity to another Sp-Tlr gene (GLEAN3_15066). This gene model is located in a short contig, which may make it incomplete.

###Gene_Info_Comments GLEAN3_19041 ###
This gene model doesn't have a TIR domain, but the nucleotides encoding SP, NT, LRR(4-14) have 90% identity to another Sp-Tlr gene (13751). This exon is located at the end of a contig, which could make this model incomplete.

###Gene_Info_Comments GLEAN3_07261 ###
Partial sequence.
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137537553-3176-9890171404.BLASTQ4
###Gene_Info_Comments GLEAN3_19835 ###
This gene model doesn't have a TIR domain, but the nucleotides encoding SP, NT, LRR(3-7) have 98% identity to another Sp-Tlr gene (26274). It is located at the end of a contig that is far from the next one, which may make this model incomplete.

###Gene_Info_Comments GLEAN3_20045 ###
There is 1414bp of unknown sequence (NNN) in this gene model, which could make it incomplete.  The nucleotides outside the unknown sequence have 85% identity to another Sp-Tlr gene (15303), so it could be a member of the Toll-like receptor family.

###Gene_Info_Comments GLEAN3_01665 ###
Similar to Dual specificity phosphatase 11.
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137536261-8980-151643868263.BLASTQ4
###Gene_Info_Comments GLEAN3_20666 ###
This gene model doesn't have a TIR domain. But the nucleotides encoding SP, NT, LRR(9-21), CT have 88% identity to another Sp-Tlr gene (05950). There seems to be an assembly error in the contig of this model, which makes it incomplete.  

###Gene_Info_Comments GLEAN3_14625 ###
Does not clade with human Ppm1g in phylogenetic analysis.
###Gene_Info_Comments GLEAN3_24003 ###
Based on position on contig and alignment with best blast hit (human) it is likely this gene model is missing a 3' exon.
###Gene_Info_Comments GLEAN3_28884 ###
GLEAN3_28884 represents 5' end (exons 1-3) of the Sp-Stx16 gene. GLEAN3_28885 represents the 3' end (exons 4-11) of this gene.
###Gene_Info_Comments GLEAN3_20266 ###
Similar to Protein phosphatase 1D magnesium-dependent delta isoform.
###Gene_Info_Comments GLEAN3_28885 ###
GLEAN3_28885 represents the 3' end (exons 4-11) of the Sp-Stx16 gene. GLEAN3_28884 represents the 5' end (exons 1-3) of this gene.
###Gene_Info_Comments GLEAN3_05149 ###
Incomplete gene model: expected N-terminal parts are absent
###Gene_Info_Comments GLEAN3_21194 ###
This gene model doesn't have a TIR domain, but the nucleotides encoding SP, NT, LRR(8-18) have 91% identity to another Sp-Tlr gene (00615). This gene model is located at the end of a scaffold.

###Gene_Info_Comments GLEAN3_07282 ###
Similar.  Portions of this sequence are nearly identical to Tyrosine-protein phosphatase, non-receptor type 23. 
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137537628-5867-112430593701.BLASTQ4
###Gene_Info_Comments GLEAN3_21299 ###
This gene model doesn't have a TIR domain, but the nucleotides encoding SP, NT, LRR(18-23) have 89% identity to another Sp-Tlr gene (14548). This gene model is located at the end of a contig, which could make it incomplete.

###Gene_Info_Comments GLEAN3_07337 ###
It's missing 5' and 3' end of the gene.  It overlaps with Glean3_18583.
###Gene_Info_Comments GLEAN3_18583 ###
It's missing 5' and 3' end of the gene.  It overlaps with Glean3_07337.
###Gene_Info_Comments GLEAN3_16411 ###
Blasts to PTPRM, but phylogenetic analysis showed that it does not clade with the PTPR K/M/T/U group.  Renamed PTPRorph1. Partial sequence.
###Gene_Info_Comments GLEAN3_21421 ###
This gene model doesn't have a TIR domain, but the nucleotides encoding SP, NT, LRR(13-23) have 87% identity to another Sp-Tlr gene (16468). This gene model is located at the end of a contig that is far from the next contig.

###Gene_Info_Comments GLEAN3_21425 ###
The nucleotides of the first exon + the following 200bp have 87% identity to another Sp-Tlr gene. The first exon is located at the end of a contig and far from the second one (76000bp).  The second and subsequent exons could be a wrong prediction.

###Gene_Info_Comments GLEAN3_23505 ###
Splign sequences support gene model.
###Gene_Info_Comments GLEAN3_00663 ###
This is clearly a partial sequence of a DUSP.  It has been tentatively identified as DUSP24.  Duplicate of GLEAN3_18623.  
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137535380-30970-10768348136.BLASTQ4
###Gene_Info_Comments GLEAN3_21914 ###
Partial Toll-like receptor. The nucleotides encoding SP, NT, LRR(14-23), CT, TM have 86% identity to another Sp-Tlr gene (00615). Unknown sequence (NNN) at the 3' end of the coding region could make this model incomplete.

###Gene_Info_Comments GLEAN3_09937 ###
Similar to Intestinal acid phosphatase PHO-1.  Partial sequence.
###Gene_Info_Comments GLEAN3_23031 ###
Partial Toll-like receptor. The nucleotides of the first exon + the following 683bp in the intron have 90% identity to another Sp-Tlr gene (21162). This first exon is located at the end of a contig and far from the second one (13500bp).

###Gene_Info_Comments GLEAN3_18623 ###
Duplicated gene.  See GLEAN3_00663.
###Gene_Info_Comments GLEAN3_17995 ###
Similar to amPTPN3 and to Gallus gallus protein tyrosine phosphatase, non-receptor type 1.
###Gene_Info_Comments GLEAN3_16144 ###
The sequences for GLEAN3_16143 were added in front of 16144.  They appear to be part of the same gene.  Also, there is an RVT domain in this gene that probably should not be there.  May be a sequencing error.
Multiple duplications.  See also GLEAN3_08253, GLEAN3_16053, GLEAN3_19852, GLEAN3_20604, GLEAN3_22839, GLEAN3_24537, and GLEAN3_27101.  Blasts to PTPRA but part of a novel clade in phylogenetic analysis.
###Gene_Info_Comments GLEAN3_23193 ###
This gene model doesn't have a TIR domain. But the nucleotides encoding SP, NT, LRR(10-21), CT have 87% identity to another Sp-Tlr gene (28576). There seems to be an assembly error in the contig of this model, which may make it incomplete.  

###Gene_Info_Comments GLEAN3_16053 ###
Blasts to PTPRA, but in phylogenetic analysis it forms part of a novel clade with PTPRLec1, PTPRLec2, PTPRLec3, PTPRLec5, PTPRLec6, PTPRFn1, and PTPRFn2.  
###Gene_Info_Comments GLEAN3_22839 ###
Blasts to PTPRA, but forms part of a novel clade in phylogenetic analysis.  See also PTPRLec1, PTPRLec3-6, PTPRFn1, and PTPRFn2.
###Gene_Info_Comments GLEAN3_23934 ###
Partial Toll-like receptor. The nucleotides of the first exon have 90% identity to another Sp-Tlr gene (09435). The first exon is located at the end of a contig and far from the 2nd - 4th exons.  The 2nd - 4th exons were eliminated.

###Gene_Info_Comments GLEAN3_23936 ###
This gene model doesn't have a TIR domain. But the nucleotides encoding SP, LRR(9-19) have 87% identity to another Sp-Tlr gene (14352). This gene model is located at the end of a contig, which could make it incomplete.

###Gene_Info_Comments GLEAN3_27101 ###
Multiple duplications.  See also GLEAN3_08253, GLEAN3_16053, GLEAN3_16144, GLEAN3_19852, GLEAN3_20604, GLEAN3_22839, and GLEAN3_24537.
###Gene_Info_Comments GLEAN3_22506 ###
Missing N-terminus.  
###Gene_Info_Comments GLEAN3_00971 ###
Missing C-terminus.  See GLEAN3_22506.  
###Gene_Info_Comments GLEAN3_01698 ###
Missing C- and N-termini, but a strong hit to GABA transporter.  For the N-terminus, this prediction should be combined with GLEAN3_01697.  See GLEAN3_06561.
###Gene_Info_Comments GLEAN3_00076 ###
Missing N-terminus.  See GLEAN3_06561, _01698.  
###Gene_Info_Comments GLEAN3_25257 ###
This gene model doesn't have a TIR domain. The nucleotides encoding SP, NT, LRR(15-23), CT have 88% identity to another Sp-Tlr gene (21420). This model is located at the end of a contig, which could make it incomplete.

###Gene_Info_Comments GLEAN3_25613 ###
This gene model doesn't have a TIR domain.  The nucleotides encoding SP, NT, LRR(9-19), CT have 94% identity to another Sp-Tlr gene (07850).  The model is located at the end of a contig, which could make it incomplete.

###Gene_Info_Comments GLEAN3_14977 ###
Missing N-terminus.  See GLEAN3_03832. 
###Gene_Info_Comments Sp-TlrP34 ###
The nucleotides of the first and second exons in GLEAN3_26438 and the intron between them have 88% identity to another Sp-Tlr gene (GLEAN3_15066). The second exon is located at the end of a contig that is far from the third one. The third to 10th exons could belong to Sp-ABCH1.

###Gene_Info_Comments GLEAN3_27445 ###
This gene model doesn't have a TIR domain. But the nucleotides encoding SP, NT, LRR(15-23), CT, TM have 88% identity to another Sp-Tlr gene (23035). This gene model is located at the end of a contig.

###Gene_Info_Comments GLEAN3_28380 ###
This gene model doesn't have a TIR domain, but the nucleotides encoding SP, NT, LRR(8-15) have 88% identity to another Sp-Tlr gene (14548). This gene model is located at the end of a scaffold, which could make it incomplete.

###Gene_Info_Comments GLEAN3_07205 ###
Missing N-terminus.  See GLEAN3_03832, _14977. 
###Gene_Info_Comments GLEAN3_09356 ###
Missing N-terminus.  See GLEAN3_03832, _14977, _07205.  
###Gene_Info_Comments GLEAN3_08246 ###
Missing N-terminus.  See GLEAN3_03832, _14977, _07205, _09356.  
###Gene_Info_Comments GLEAN3_08617 ###
Full length.  See GLEAN3_03832, _14977, _07205, _09356, _08246.  
###Gene_Info_Comments GLEAN3_16011 ###
Missing N-terminus.  See GLEAN3_03832, _14977, _07205, _09356, _08246, _08617. 
###Gene_Info_Comments GLEAN3_27712 ###
highly homologous to glean3_27713, located on the same scaffold
###Gene_Info_Comments GLEAN3_27713 ###
highly homologous to glean3_27712
###Gene_Info_Comments Sp-PLN ###
Absence of a complete cDNA prevents further identification of the gene, but it probably extends to another scaffold based on the protein's size and the position of the existing cDNA on the end of the scaffold.
NOTE: 
- exon 8 is missing 9 bp after 2555.  
- exon 4 probably falls in the poly-N region 
###Gene_Info_Comments GLEAN3_05376 ###
Position on contig and alignments with best hit suggest the gene is missing 3' exons.
###Gene_Info_Comments GLEAN3_22727 ###
Haplotype of Glean3_05376.
###Gene_Info_Comments GLEAN3_18472 ###
Has similarity to hatching enzyme. Tiling data supports glean3 gene model.
###Gene_Info_Comments GLEAN3_05663 ###
Similar to Protein phosphatase 1 regulatory subunit 12B. Partial sequence.
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137537271-5077-208204720659.BLASTQ4
###Gene_Info_Comments GLEAN3_05182 ###
Sp-064 has many exons that are found on several Glean3 models: 05182, 12439, 18503, 17239.
There are additional Glean3 models that overlap and may represent the other allele.  These are: 
08381 overlaps with 05182; 
10678 also overlaps with 05182 but in a different region than 08381; 
18503 overlaps with 12439; 
10474 overlaps with 18503;
07883 overlaps with 17239.

GLEAN3_05182 has exons 1 through 14 of 23.
###Gene_Info_Comments GLEAN3_13607 ###
Similar to Sp-R-PTP-delta.  Partial sequence.  May be a portion of a duplicate gene.  Another Sp-R-PTP-delta, GLEAN3_00831, is not on the same scaffold.  GLEAN3_00831 is probably a duplicate.
###Gene_Info_Comments GLEAN3_12439 ###
This scaffold has exons 16 through 18.  Exon 15 is missing from the assembly.
Other Glean models overlap this region and may be the other allele.  GLEAN3_12439 overlaps with 18503.
###Gene_Info_Comments GLEAN3_25759 ###
Similar to R-PTP-delta. Partial sequence. See also GLEAN3_00831 and GLEAN3_13607. See structure of this gene at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137533949-23252-71761336427.BLASTQ4
###Gene_Info_Comments GLEAN3_18503 ###
GLEAN3_18503 includes exons 19-22 and part of exon 23 of this gene, which is also found on GLEAN3_05182, GLEAN3_12439 and GLEAN3_17239.  However, it does not include the 3' end of the gene, which is located on GLEAN3_17239.
GLEAN3_18503 also overlaps with GLEAN3_10474.
###Gene_Info_Comments GLEAN3_17239 ###
Glean3_17239 includes the 3' end of this large gene.  It overlaps with Glean3_07883.
###Gene_Info_Comments GLEAN3_10823 ###
Similar to R-PTP-mu. See also GLEAN3_22405.
###Gene_Info_Comments GLEAN3_22686 ###
Similar to R-PTP-mu.  Partial sequence. See also GLEAN3_06528, GLEAN3_16411, GLEAN3_18743,  and GLEAN3_26582. 
###Gene_Info_Comments GLEAN3_16669 ###
Similar to Dual specificity protein phosphatase 3. Partial sequence.
###Gene_Info_Comments GLEAN3_28174 ###
no domains were detected
###Gene_Info_Comments GLEAN3_18743 ###
Similar to R-PTP-mu. Partial sequence.  See also GLEAN3_06528, GLEAN3_16411, GLEAN3_26582, and GLEAN3_22686.
###Gene_Info_Comments GLEAN3_06528 ###
Similar to R-PTP-mu. Partial sequence. See also GLEAN3_16411, GLEAN3_22686, GLEAN3_26582, and GLEAN3_18743.
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137537418-19107-58243529506.BLASTQ1
###Gene_Info_Comments GLEAN3_23162 ###
Blasts to PTPRT, but in phylogenetic analysis it forms part of a novel clade that also includes PTPRLec1, PTPRLec2, PTPRLec3, PTPRLec4, PTPRLec6, PTPRFn1, and PTPRFn2.  
###Gene_Info_Comments GLEAN3_01207 ###
Incomplete prediction/assembly problem. First 305 AA from the 544 AA protein are present. 
###Gene_Info_Comments GLEAN3_01597 ###
A description of a homologue of this gene appears in:
Emery,P., So,W.V., Kaneko,M., Hall,J.C. and Rosbash,M. Cell 95 (5), 669-679 (1998)
CRY, a Drosophila clock and light-regulated cryptochrome, is a
major contributor to circadian rhythm resetting and
photosensitivity
###Gene_Info_Comments GLEAN3_11174 ###
Matches_GLEAN3_08752. The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: FRELDSLYEGISSITSSRRRVEVTTFLLDQLESNDRSNGQVMVAQQPTNPTNNHVNNNMNSGNLEQPMHHDSESDEGFEEMDTGVEGAVGQSHSVSPTPSDE
###Gene_Info_Comments GLEAN3_08752 ###
Matches_GLEAN3_11174.
###Gene_Info_Comments GLEAN3_17642 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: STYIGPWCAGFGDLRARGSSGGSRPYTCTCTYTRILAIWGTGMIRDPVMKNKKFTVWFPKIGVHLIHRDLRYCAFVINLLICSSLATCFLSDGCQCLFKLL,DRFLSLFAHLECVSETLIGILSVCLLSSSQTWKWKSFIPVNIFLHLSCLQERCAANCGICFCYLQYLRMVVENIIKTFRD
###Gene_Info_Comments GLEAN3_06683 ###
Model_must_be_split_in_2. Transcriptome data indicates that Glean may have falsely predicted the following exons: 3,14.
###Gene_Info_Comments GLEAN3_23867 ###
Exon 2 is alternatively spliced in Paracentrotus lividus, Arbacia punctulata, and Sphaerechinus granularis.
###Gene_Info_Comments GLEAN3_09262 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: ASLSPTPHVHQRYPQGLGHLQSSPASLAYQTASRSQETCWRGVYQMVKLHQGQYSSVTVSCDLNSFGVCRRVLFQRLPSAGSASLVSSGCSCPVLGVFRIPLMRDIYQVTLASSLQTLYYL,QFQGTLCLLFWKVFLSFASLAFLILHQYLVYLLFLVYLFFVSPLLLLDFVCGSHWCLSLLNEFLLVWCVLHDYSQGFVLSEG,NGFYQQILRWLNSLYPSYTKDTSSRFLDREENGGLFLWLRFNDRFSLYPDLLHDLHDFLHGRLADDIRHKQVFLQHLTFTKGIHKV,TVFIPHTRRIHPQDFLIEKKMEVFFCGCGSTIDSLCTPTSFTIFMTSFMAVWLTISVISKSFSNTSRSPKVSTRSRASSKLTRFFGLPNSKSLSGNMLARGLSNGETSSRTVLICDSFLRSKLFWRLSASSFPTFTFSGLRFFGFFRMLLPCAWCVPDSFDERYLSSDTCFLTSDTVLSVTNSLVGIGLVTEVSSGALPNS,SRLLVESMPASLNVLSADGSSTSSILDSVCVCLILASFSGDWLLCALLSITISGDPVSTVLEGLPLFRFFGFSDPASVSGLPAFLGLPLFRFSSPSAGLCVWLTLVSFPTE,PMLISFSAGWCVTTASSKPTRDSVLNSLGGLPLFCFSSSCFISKMHALDLSFKAPLLSSVNLSNLISTVSILDLSIVDVLATTLGDKQDPGTIPLLGLPCFCLSPFPDPELDIFGWYGLNSLKGLPTFPVLFFSDSTSDSNCSVRASLPVD,FSFRGLPLFLFSLLSVPESIPDRLTSSSLSVDWLSSTLACKLSTDESDIDTFRGLPLFLFSELESTSESVPDFPNPAILSNDWPRLTLDPLLGLPLFLFCGLSAEVSTPS,SSIHTLPLHCANKHGFIPSPPTLSSSGLPPPPPSPPPPRPLLGSHKHAASPTTLAATTAATAATTAAESGPATGAVATVACRHAPPGSFHSSSH,TISSPCYVSPDVHKYRYTSACCLGLFCCASAIQIEIKILCWVIQNLYFLLLQYQGFYLSQQTKLRFSICVCVYLFMNYLCTCTMQPLIPCHCSLAPGFTFIWRV
###Gene_Info_Comments GLEAN3_23730 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 5,8.The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: AEPKLCRGTPTHSITVQHSIIWSMLTKPCASMHTFSRSNQGEKSKHFFIDIWTGSVKHVIFESCANSIDLSPSAYAIGDCLCRSRYAQTPMVDFIRTLKLVQ,QSSFDSGIILVTSLHSICHTNPLFPHHQSSPCRNLTQTKILHSPLHTTNLPHAETSRKPRYSIPLSHHPSSPCRNLTQTKILHPPLHRHTDYKSTDDLQHLSYSEQCNFLLGLLLD,HLYTPSATPTPSSHTTNLPHAETSRKPRYSTPPFTPPIFPMQKPHENQDTPSPFHTTHLPHAETSRKPRYSTPLYTATLTTSRPMIYNILAIRSSVTFCSDFCWIDIDLTGVTGVLWMRMGNIIFQQIIIKRGLL,FRFGHHPRDIFTLHLPHQPPLPTPPIFPMQKPHANQDTPLPPSHHQSSPCRNLTKTKILHPPFTPPIFPMQKPHANQDTPPPFTPPH,VERSSNGIKPRATSLRPPLEYIITRLLQEQKPDRGKLDRETRTLRKRLVTLRASLVIPSAPPAQKAQKHGCSPDLIMLSLTVISSSLHPSRAAVHHSNPPPPLFHPIPSLHFILPLFLVPLRFRV,PGNKDPEETFSDITRFTCNSLCSTRPEGAKTRMQSRSHHAIPYGDIIVVASFQSSRASFQPPPPSLPSHSIPPLHPSSVSCPSSLPCR,LNHVLNRIKTMKEQDTLRHHSLHLAFREPYPTIPVSQRRQLHDNGTEISLTLSNMQRPKHLLHSPPLLLSHLNYHPKCSIHKKTQNQIESSTLIRSCPLFSITLVLYP
###Gene_Info_Comments GLEAN3_08177 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 16.
###Gene_Info_Comments GLEAN3_20637 ###
Matches_GLEAN3_26099.
###Gene_Info_Comments GLEAN3_18126 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: TSSGGMFHLNTPLKLGKALHQWGGIVFIFTSILWCIKIIRPIFFQAIQPTGDKAIYFSGSSDRDKRAHLMIVLPVNSRGRPLM,VQAGQSQSRSNPIEISKVSTMKLPFNLFLNSHTPFLKPCFQSACCLLIVETIRHSFLTPYSILVHVSCTNVERTLNPIIASTCSHTRLR,NRHTPTITHTSQTGCHPHVPQSSTGFNLNDKSASSTRCRFFAHIHRLIKTYGSENVDIEIRHVYSLSWQHMGASMRFVLVCSGSSGGVHKQCNVYRNRLPHRV,KPIHPRWFCSFGTGVALGMIGRFEKGQNVRLIRREKVWHNHDNGNSEPRITLRPQLSKFIWRTPSRTLLTSVSAFPLIRGIKPGGFIRQNEYLPVYANTDIERACA
###Gene_Info_Comments GLEAN3_14539 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 1.The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: PYPLRPIAHTPLSITIRRSLPLPCSFPFPFPSLLPLYLACPLSSLLPSLFLLFSLSFPCLLSLLPLSTSPLSILCSLLPLYFTSFST,PTPRSPSRSVDHSLSLALSPFPSPLSSLSILHVPFPLFFPLFSFSSPSLSHVYCLSSPSPPRPFLSSAPSSLFTLLHFLLDIIAVYAFPNTNSLPLLN,LPPSSHSPHPALHHDPSITPSPLLFPLSLPLSPPSLSCMSPFLSSSLSFPSLLPLFPMSIVSPPPLHLAPFYPLLPPPSLLYFIFYLILLQCMLSQTPTLCLFLIEF
###Gene_Info_Comments GLEAN3_16343 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 1.The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: SHTNPSFFQSLYLPKEWYKSLLTYSVCFPRNSSEDTKIAIKSRDKNPFPFLLYDLLFMITVHFPRGLLLHSDKSPSLSQDV
###Gene_Info_Comments GLEAN3_02677 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: LHASPPCSPSPYLHLFFPPSLSIPSTMIAPRITSLLSVSLSPSFLPSIALHPFRCDHLAPRIPSLLSVSLSPSFLPSIALHPFRHDHL,SLYFSFSLSFRLPISPSILPSVVLHPFRHDHVAPRIPSLLSVSLSPSFLSSIALHPFHHDSSTHHLLAFRLPISIFSSLHRSPSLPL
###Gene_Info_Comments GLEAN3_02603 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: RARSEKRSTSLSKLLIDLPINCQKYARYCSEIRAAVSLFKQNSSNCLQCANLLRTILVQYTSHCYCYFIHLLYLVTSSRLDTSFVCLTSNFEETVNCHFYILYNEQSGNNRIRIR
###Gene_Info_Comments GLEAN3_05435 ###
Matches_GLEAN3_23739. The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: CMSFSLLVLYLFFCISLSISPSLPLHYASLSSFNLCLSVSVSLSLSLSLSLFPFLSLSLPFSIMVMSLSIHLPSLIFPLPLSFYVHLGLFLLLCLSFPIF
###Gene_Info_Comments GLEAN3_23739 ###
Matches_GLEAN3_05435. The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: VIITTTIIIIIISNLTTPPLSLTMTLEIPFKRGWDLSRRGCSSLCLVNTDKLNNVYTVCVVCHPFSTHRINGTFPCMCGHLQGILRVRTTISFSPCHL
###Gene_Info_Comments GLEAN3_04598 ###
Matches_GLEAN3_06159.
###Gene_Info_Comments GLEAN3_06159 ###
Matches GLEAN3_04598. Transcriptome data indicates that Glean may have falsely predicted the following exons: 2,4.
###Gene_Info_Comments GLEAN3_14461 ###
Matches GLEAN3_24163. Transcriptome data indicates that Glean may have falsely predicted the following exons: 1.
###Gene_Info_Comments GLEAN3_24163 ###
Matches_GLEAN3_14461.
###Gene_Info_Comments GLEAN3_25486 ###
Matches_GLEAN3_25486. Transcriptome data indicates that Glean may have falsely predicted the following exons: 2,3.
###Gene_Info_Comments GLEAN3_23868 ###
Matches_GLEAN3_23868.
###Gene_Info_Comments GLEAN3_20346 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: RASLLYPTFPDSLCLSLSLSLSLSVCVCVCVCVCVCVCLPPSLSLSLSLPALLAFSIFPSPHIFFTLSFLDSILHLGQPVFLSSLSLSLSLYMYLPTLSFSNPPSFCYVFLNYLFAYSLSKFSFFFLNL,LISSPLFQYCHVFICHNTISHPLNAPLYSTLLSLILSVSLCLSLSLCPSVCVCVCVCVCVCVSPPLSLSLFLSQLSWPSPSSPLPIFSLP
###Gene_Info_Comments GLEAN3_14177 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 1.
###Gene_Info_Comments GLEAN3_11315 ###
Matches_GLEAN3_11315. The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: LHVCRLLQAICLSSSLFPLTLSSVSSSLKCSHFPIYSLSSSSPFSLSVSLPPSLSLSPSPFHSPPLIILCLSGDASSLALSLPSMPAHFTAVC,KCIEGDYMYVVCFRQSVCLLPSFPSPSPVSLHLSNAPISPSTPSLLRLRSLYLSLSLLPSLSLHLPFILPRSLFSVSLAMLLRLLFPSRPCRPTSQLFA,ILRDENVLRVITCMSFASGNLSVFFPLSPHPLQCLFISQMLPFPHLLPLFFVSVLFICLSPSFPLSLSISLSFSPAHYSLSLWRCFFACSFPPVHAGPLHSCLP,HQHSLPVAPCKTPCRSRETIPIIFFHFKTLMQPPQMTNIVQYFIIKTRARGLLCTFEAALPNPPPLLLSLSLLHHPQTIQSTHPRPKQMPGKFTNDHTSGGSRIL
###Gene_Info_Comments GLEAN3_17983 ###
Matches_GLEAN3_17983.
###Gene_Info_Comments GLEAN3_18954 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: LDDFKRINKPNGVDDNLQKKYQVLERHVTSLRQYGQKMNQRLEMLETSNNGVCVWKIANYNEKKKDAMKTNVKSICSPPFYTSQYGYKLCGRVFLMGDGVGKGTYISLFLTIMKGSFDAVLPWPFKERITFQLVNQDDSINKSIVEAFRPDPASSSFKKPTTEKNIGAGCPLFAKIQIIEDPKSGFIRDNTMYLKIICQTSDVPEIK
###Gene_Info_Comments GLEAN3_06753 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: XXXXXRLHLSGVWCSLSDNAAGSETLDFMCLVFSVKPLDLEVFYLHYLKFGVRQGYWIWSTDLTAIIWCEVPANVTSFGALHLT
###Gene_Info_Comments GLEAN3_13305 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: ALDKGNTWQFPELHPNWNLNATSNLMKIFSHIFIRSTFLSLLILSVPVPMITSITTIPKIFNLVVANYHVFTKNTSSYQLKI,ISLSGGRGHLCDCLGGQSSSSIEAAYCYSHGRVGDGVTSPSTSSSLTKTTSTKLSPHSSLSSSSGCYIIPNLMGHVIYLIIFLWDSPCNYFI
###Gene_Info_Comments GLEAN3_08936 ###
Matches_GLEAN3_08936.
###Gene_Info_Comments GLEAN3_27487 ###
Matches_GLEAN3_08936.
###Gene_Info_Comments GLEAN3_16650 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: GMSKVIWYSLLQSDLSAEEIQAWLDRKVHGTVLESIHTCRDTDPAMQDNNYARYERSDKALKPLIGQVMCTRPSISCATCRSSIVNFPSPGASEDQLEQLGLREQVSMVYMIEWMCCTIF,SYYVDKHDQRERTEPSPFSFRLPSVALFSHLTFHPLFCFQFDIKGKCPHSRATPLMLLICPCTKASVNKTSDIGRRRPLTGSPLFSV
###Gene_Info_Comments GLEAN3_06150 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 4,7.
###Gene_Info_Comments GLEAN3_15456 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: RSWTEAMTERIRDVHILREGERERGGDREKERERRRERERGDRGLEEMRKKAYAWWEKQQCSIHRRVQSSMRMYPGGMTGCYARPQDHRKQKRKTTTTTT
###Gene_Info_Comments GLEAN3_00485 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: SLSVSLYIYIFPLFFPFFDTTLSPSLPPFFYFSTTQDLYMSPSIFYFYFSSLFHLYLSLSLSLSLFISLSPSLSLSHSFLSSYSVLSSLSVSSFSFFLSPTENA,LSLSLSIYISFPYSFPFLTPLFLHPFPPFSIFPPHKTSTCLPLFSISISLRSSISISLSLSLSLSLSLSLPLSLFLTPSSLHIPSFLLSPSPLSLSFFLQLRTR,HHSFSIPSPLFLFFHHTRPLHVSLYFLFLFLFALPSLSLSLSLSLSLYLSLSLSLSFSLLPLFIFRPFFSLRLLFLFLSFSN
###Gene_Info_Comments GLEAN3_02592 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: IFATKRPYIVVAVQPVIKCELLYILFLPMSPLHAPSLPFFLDNIIWSSSFLCLSISLKYPPDTNATFSNIITIITQNRLLLSPSISPILPFVYIK,VAGSKKKTLVSFISCPKPESGRACTPLVPLWPVFTRCKPPKRGQTLLDIEWMRTYLIRLPPSRLLCATLEIFSQTYESNS,SLLSLSLTPLTSLSYLPPFPFLPSLSFSYSSSLSSSSSFPLLLLLFVFLFVFFPFPFSRLLSLFLSFSVSPSSWVLAGGSRGHCPQISCPCRCPSFGHIVMWLLMCLSSTMTQQ,HFSLFLLPLSLHYPIFLPSLFSPLSLSLILLLYLLLLLFLFFFFFLSSSSCSSLSPSLVSSPCFFHSQSHPLPGSLQEDLEGIAPRFRAHADVLRLVTLSCGC,FVNRRTPLTQNLSLRLLSQPYHLPPPSLSLTPFLFQFNHTSTRIHCLIGLFSDKRGNKSTKLMSRPDDDTISKTNMSERINYQNIMERGCKSNSSCDGNGCIGIDG
###Gene_Info_Comments GLEAN3_16685 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: PLVTYAYLPDCLSGSSRGQVRKARRSGACFVEFQFHFCLPSVSLPLSPSLSLSLSLPLSPSLHLFLHDSFHPSLTLFSPLQ,LPTLTFLTVYRGAPGVRYEKLVARGPVLSSSNFIFAFPPSLSHSLPLSLSPSLSLSPLLSISFFMTLFIPRLLFFPPCNENGLYVSHLSVV
###Gene_Info_Comments GLEAN3_13843 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 2,4.
###Gene_Info_Comments GLEAN3_13689 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 1,2.
###Gene_Info_Comments GLEAN3_27446 ###
Matches_GLEAN3_27446.
###Gene_Info_Comments GLEAN3_13178 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: MEQVDPPKDDATSSESLGEQREKTVPDVNSAGDQCSKESDGNEGKEETAKIPNSKEEDPALPSTSTGEEGMTADSSSHDDPEGNADEKMEESKDTDDKIEERQGTDDKGAKQVDGDDQLEEGEDRNNEHPGREPRDAEFTSEIFKIMLRNLPTRFGFQVGV
###Gene_Info_Comments GLEAN3_18392 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: LFSAGKLDKSGSPRHIFDDGIAAKRPRITLPPAPRLRSLPYNTPPVAPLRSEAHRREVAPQTQPSFHPRHGQAVTSPNDIEDQRQVLVSDHAQRPARSHLVQSHHILQRNHLQRQQQHHHHHLLPQQHSLVSLLREPVVTTSPAFERLGIGPRAVTGNEAGSASGMPQTRASPVCDSCTDGAGCWKEMTGIGCKLETKELWDRFHELGTEMII
###Gene_Info_Comments GLEAN3_00129 ###
Matches_GLEAN3_00129. Transcriptome data indicates that Glean may have falsely predicted the following exons: 4,9,12.
###Gene_Info_Comments GLEAN3_22971 ###
Matches_GLEAN3_22971.
###Gene_Info_Comments GLEAN3_15425 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: FKSFQKTTNGGMSVSFSPNLSSNLEKNTLQNFDLNCHDKTVFYDIGHRSRLMQDMCTYKTFMQKKISQNNIQEHYYLFEEKQNGRRNASNPCSTVSISILNIRILIP,HDLNLSKRQQTGECLFHFLLILAAIWKRIRSKTLILTAMTKLYSMILDTEAGSCKTCVLIKPSCKRRYHRIIYRSITIYLKKSRMEGEMHPILAALCQFPF,LGVTLRDRRRNKEIRKELKVGNILELARDMRLRWFGQSEWADEGKPAKDRMTRAVEGSRGRGRPETCWKEGYLKKELNLTAAQTGNRREWRLRIRPTNPC
###Gene_Info_Comments GLEAN3_11080 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 1.
###Gene_Info_Comments GLEAN3_05022 ###
This is the bHLH domain of Sp-ahr. The C-terminal sequence is either in glean3_13788 (more complete) or glean3_12296 (one PAS domain only).
###Gene_Info_Comments GLEAN3_25737 ###
Matches_GLEAN3_25737. The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: AIFFNSLESLTREALPSSPTLESVLPYPESSVNLPNAPPACSILVRGVLSEEFSSTLGPVLLGVLSSVGRGRINEVMSEPGGALSLFAGTSWPVLG,RILLFTVAFCLSHIFQQLGVPHKRSPAIIPHTGIGASIPRVQCKSSKCTTGLFHSRKGGAVRGVQLYFRTSALGCIVFSRAGENQ
###Gene_Info_Comments GLEAN3_25589 ###
Matches_GLEAN3_25589. The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: ITSFMRQGPDLLLSSIRDLLTDNGHNILETIPLYLIQFSLVKRRRKELSLRQGTIRRTHFFEFVSEKAFSVIRKQIFFIFPQQ,IYSIYFTGGGELSNQLMFPTSMLSGSTSKFTSPNIFSKLEFSIESFPESLPSTYLLGVIIRDLRELRIPPDVLGMFLFTGDNMWCLGVLECPTRLL
###Gene_Info_Comments GLEAN3_11348 ###
Matches_GLEAN3_11348. Transcriptome data indicates that Glean may have falsely predicted the following exons: 5.The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: TGGEPRQNIWTLIRSRRYGTTSSRSYDRLYMENNGLNWWRTPVESPDINPIEKVWNDLKRFLRQIVHGKQWTQLVANPSRISRH,TGGEPQQNLQISIRSRRYGTTLRGSYERSYMENNGLNWWRTPAESPDINPIEKVWNDLKIDIYIYIIIFAFFFILYTREWKTATKPEHLIGIEVF,CWKRCSLFASGSICNFEGHCRRSLTKNVKYRSIETETAVDHGVLNLEQEKSSKFLIIKYRLEGTRMKSLALLSRIVESSIARFIVESFQGPGSSP,VSNTFLFCQQVPHIFDTSGDKVKVRTEVYSHDALKMWWAALIACGIYIHAHLTMLCQRLINKYREKFMYGAPTIISIARPG
###Gene_Info_Comments GLEAN3_27598 ###
Matches_GLEAN3_27598. The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: RISRRSELSNVKGSFLLRETTSELATSRVQFRIGFSTIQYSTDGTISRSPAKVMLEALQYICLRKHLQLRRPLQNGHRQRMINTVV
###Gene_Info_Comments GLEAN3_24486 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 2,3.
###Gene_Info_Comments GLEAN3_14405 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 8.
###Gene_Info_Comments GLEAN3_17375 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 1,2,3,6,7.The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: EELTVEDGKDSHVPPLLIYLNDEYSESVLDMLLLAHLKIEDKIGIYHVLPLHCWTHQFHYHWTGFYLHHSTAISEWEVQLLRYYSLGQGSVVFCLLSLPSRHELDEYYRHHLLY,PPIPYQDGIATKIGAKPTFKSLFLKDPILALKCFFGPAVPASYRLQGPHVWSGARDTIMNVWQNTVSGTKFRDTPIANGPEGYPIALKLIFLVCIVAGLYLAMM
###Gene_Info_Comments GLEAN3_00749 ###
ligand binding domain is found in glean3_11061
###Gene_Info_Comments GLEAN3_27623 ###
Matches_GLEAN3_27623.
###Gene_Info_Comments GLEAN3_19444 ###
Matches_GLEAN3_19444.
###Gene_Info_Comments GLEAN3_28093 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: KPRWTYKQVACQNITAAKARQKWHKQSSKIGCSGGVSRFSVCTLPILMSVILFINESNLKLGSKLRIRGVVPYFVPLNNSLAIIEAKNRITWAKFLNRHKHVL
###Gene_Info_Comments GLEAN3_28148 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: SYLSLHPYLFVCLPVFLSFCLSGFCRPLHILTPPVYICLLVCLSVCLSVWLSLCLRLSVCSCLYISLSLSVCPCMPVCLSVCVCMRVWVGGGVRVSLSFSILATPQPSSNYPALSPSLPLSQSLSHSFSLSSSLSF,LILSFSPSIPLCLSACLSVFLSLWLLPTTPHLNSPCLYLFVSMSVSLSVCMVVSLSPPLCLFLSVYISFSVCLSVYACLSVRVRMYACVGGWGRTCVSLFFNLSDPSTILKLSSSFPFSPPLPISLPFFLTLFISLILTLMSLFSLWAQLHSDPPS,ERQIKREKYRDKVCVSVLPVCVCVRVRERERERERGRERGGGAGKREKGVNERDRERERNTEIKCALVCCLCVRVCVCVCVCM
###Gene_Info_Comments GLEAN3_03704 ###
see glean3_09520 for information about correct glean model assembly.

note missing sequence:
ITPKCGVPNVFPSPLRLGE

Also, there is either alternative splicing or an extraneous exon.
###Gene_Info_Comments GLEAN3_11576 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 5.
###Gene_Info_Comments GLEAN3_01519 ###
Matches_GLEAN3_01519. Transcriptome data indicates that Glean may have falsely predicted the following exons: 5.
###Gene_Info_Comments GLEAN3_03920 ###
Matches_GLEAN3_03920. Transcriptome data indicates that Glean may have falsely predicted the following exons: 4.
###Gene_Info_Comments GLEAN3_27215 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: VKWVCACYERYYFVLSCRFMVLNIITTDGGVVQITIYGGGGGVKSRLETHLCLHVKHSARYKRPSHSSAITRNGNVLSHTPVTLTILAPDRHQTDTRPTPDRHYPEQ
###Gene_Info_Comments GLEAN3_12491 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: NSLLSCREQVSRKLAKLYFLHLGIRGGPISLIMITLASKIGLLKFTWKKVFDTGIANPLALMSPRHRSSQIQYFSTSTAGNRFKLSLSVKWDRPRILRNQSSSSIILRTFFQFIKNMMVSLIRYHFLMFLI,REREWWRMRYMQGGEDERGRDRWLDREKGRKRDRERERERRGYKWVQKIDHGLFEIDLYHHLSIPLFLPLHTIISCQKSKHSNACLTKYSLSLFKP,TTEPQTVFVCHVCSITLHLYLPLSLSLSITLYFSFSCSFYYYFSPFFHSHYFSSNALLPPHFRSLALSLSTNHLVSFSRSLLSLLFIFSYSPLRSLYGYDTCMWQIYRGS
###Gene_Info_Comments GLEAN3_13047 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 4.
###Gene_Info_Comments GLEAN3_11297 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 3.
###Gene_Info_Comments GLEAN3_27334 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 3,4,6. The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: YCCPKPFRLVTATLRTMVKLTNEAMDIFSKTLSFLNIEPKMINATRPVGRPFSLGKEDVSVFHAVRHDHHHHIITMSDHHLMLIPDMNDEVTDAH
###Gene_Info_Comments GLEAN3_18351 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: TQPCFKLNVFENRDARVKKCRLNAPLAKHPDKTKPSCPFLHLRPGSKLLCLQICAKRLGVGNSEKLRNFEACFIFFIKLLPIWFPQLILFCTQGLSYQDCYFPPIFIYFRFGIFNQGNRRSVIF,VSKSMVFTNTHLFLPFLISVGEALKVRFLLIKFAYSTSTLSLYIRSAPSGTYRIPVISMCDIILNIQLYANGNDHTFSNIKD,LSLVLLTIFYDTGRRSREFRLRAEPHFRSVFVYSSILFPYFQKLLNSKYSKINCFHGNLERNFPMPLNDCMRRIAATQCDL
###Gene_Info_Comments GLEAN3_07981 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 5.The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: PSISKSPFLYQSSIIPPPFLHHSSTIPPPFLHHSSTGRLSFLHHSSTIPPPFLHHSSTSLLPFLHQSSSIPPPFLHHSSTSCLQFLHHSSTSPLLFLHQSSTNPPPLFHHPSTTPTLNLYQSFTNHNPKCLPH,FLHHSSTIPPPFLHHSSTIPPPVVYHSSTIRLPFLHHSSTIPPPVFFHSSTSRLPFLRHSSTIPPPVVYNSSTIPPPVLYYSSTNPPPILHHSSTIPPPLPHSISTSHLPITIPSVFPI,KSIPLPILHNSSTIPPPFLHHSSTIPPPFLHRSSIIPPPFVYHSSTIPPPFLHQSSSIPPPVVFHSSAIPPPFLHQLSTIPPPFLHQSSTIPPPILHQSSTTLPPSLHHSHTQSLPVIYQSQSQVSSPY
###Gene_Info_Comments GLEAN3_04414 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: ASARVLCLYSLCIVRFLILPSDLCQGFPNLQWVHFCRRCHLHVHSMCMMDLSSSCKSFNICSNLQFKPFRSLPQFRLITFGFPVCRSHFKEILNYKFMA
###Gene_Info_Comments GLEAN3_00424 ###
Matches_GLEAN3_00424.
###Gene_Info_Comments GLEAN3_04844 ###
Matches_GLEAN3_04844.
###Gene_Info_Comments GLEAN3_17725 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: KRGIRSASTQLIVHFVRLAFGYLHVKFFYLTCLHVCKMLIEIQQYYGLFNSMLTKEHCTLHVVAYSKGDREQVIFLNGFPCSRLKC,CPSHNMHSSLFQILVANMMSWLYSSQHSTHFNISICFAKFVIFHPGHVFLIRVSGRICSCSIASCKHTNCVFLPIHVYCLGCSHYSYNFFSIIRVGIKLYSPYLIRLYSEFTFF
###Gene_Info_Comments GLEAN3_01998 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 2.The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: NTTCAARKLVYANETQNHHLFPIIRIMLFCHIFGIDLHLCGPYTGMHFHCKKYIKNEMIMSSPKINERIFPDVLASLLRSPFEIEEAPVAACVELYSSLINTTRCRRGNSEHNFIFA,LIRHDVEEATVSIISFLLSQPASQGVPHHTNYKYFHFIRASHCPRAFSFWLLAGFARIFGSWGRGFSCCSCDNRGKENMVGKEEEKNRFNIVSKKRVVAESVYHRLPRST,FSNNHHQSFLFLPSLSLSLSLCLSLSFCRLIIVCELFLYQSMASPLQSPPPIVRLPPLGTKLSTGWSCMHAEKTPQFWSP,LFSPFLSFLIITISHSYFYPLSLCHSLSVFLSLSVALSLFVSFSCTSQWLAPFNPLHPLFACLLWAPSCPQVGHACMQRRLHSSGPPKWPCLPCVVLRLPFFGGIDSSPIVSASVFLSPHSSGFGS,MKSIYKCNCSSCFNLGIVSKLESTAPCILPSLQLSGSNANQNKDILGKPFVETQWGCWSNWRRDDWPCDLKCHAGNIPTTSNSTDTDCLNTYVMVLVMNVFGYVCTCAVVFTLR
###Gene_Info_Comments GLEAN3_21210 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: GEISCFSVHRLVTFLKVLFFRKCYLGSIKGSPHRHDDDRSSYNHQVNSRKYSVNMLMTSSLMHCGSICRKRTATFLCGLSLNLETSVEPLNVLTYMY,DYVHSNRRGCTPVSRVVCCRDIWQCGTKGSLCCAGDRDVYYRTELSRLVRRNALLCTCMICTLTEMAFLKGFDRFQTVSRLHIVCCSKRFCHFYRSTLYS
###Gene_Info_Comments GLEAN3_26962 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: CVFHSFVYCKYDKLRKGDDRSQDLGSLQCVSYKDVIWTFTTGHKILSSSIDMILLFSCVTSLVYVCHYFMYVFLTPSPVPFVYPFILIETCSVLLVNSFTSEDT
###Gene_Info_Comments GLEAN3_10404 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 1.
###Gene_Info_Comments GLEAN3_19268 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 2.
###Gene_Info_Comments GLEAN3_18951 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 2,3.
###Gene_Info_Comments GLEAN3_14157 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 8. The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: LHLPILIYSSLIPSSTYPLVHLTFSFSYSFLLSIFPILLFSPRPASHDCVPRQPRFYPKPHLISCIHPVLHVCQESVLINLPTKLDPSFHSRLSTCMTV
###Gene_Info_Comments GLEAN3_14170 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: QREEESEKEEKKGEGRKEKRKECRIESNLGFPISRVPILRLELKRGRGKERERERERGRGKERDIITCQKPLEESFRKEYNKTTAPGISDKQKLVYFKTSLSISQGALIHFKHVSQTHAQDNHYF,FLSCRGKLMMAEGNACMIYIRNTRVESSMCMVTNKVVWLDTKQKNVCQNIMITRPPGTVMRPTFFFLMKRRKDMNSCRARSKIWRKWVDGDIKQKVCAYVCVNEENVNSATVVWGQDCIK,LSTMLSTQEYQYRIMFRVTKLPRAVLHFFPVVFDVSISWQLPPPYFLLLSFFLTPPSNIFLPLFISPSLYLSICTMTFNGRRLKLHILKSNHTLPRSYQTPELAEGESRSTRD
###Gene_Info_Comments GLEAN3_19129 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 2,3,5,7,8.
###Gene_Info_Comments GLEAN3_12122 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: CNTFNVFIGILQPINELQGEGDKNRSLTWQKGFCVACDLSVCYDIQISDIHSIFAAVFCRQLLAVMLEHGLILYSRYVLQGTFLLSSPCHKERVVEFLSFSKVNFAAESMLVL
###Gene_Info_Comments GLEAN3_01739 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: RYKHYAENYFTITAKQENVKRKSCNSIRLTVVQCIVIYNYFCKNQRGFLTPLEQEATKDSRKMRHLYNVPCHSIPSLFVAFNSLYFLFPQKPPNSPLLIESQNY,NPICCENLLFFRSHHPIPCGLARYNLKYGRRNKNLISSTGSRNRKREKRNFVLQRQGGVLERNNRLGDAVTWILERMRVKREFIHLSTA
###Gene_Info_Comments GLEAN3_18366 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 3,4.The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: KFTRIWSKLLGRTNICGFITNGSAVVGREGILSRSCFSDQCLKIGFVSPSPHPKGGHGVASGGGGGSKHANCWNLRCDMCNCNWCMLMNK,KFTRIWSKLLGRTNICGFITNGSAVVGREGILSRSCFSDQWSEDRVCFALTSSERGSRSGLGGGGGNLNTRIVGICVVTCVTGIFASSHIYMCMCIY
###Gene_Info_Comments GLEAN3_28827 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: TDNLQVALHKKGLKNVKPFFAVIIDIFSTFERNNISVPSQKGVNSVLEDMSLKCLSGSTYINACNANEKPTCLRESDGEFPNSNQVGSCSNPLEKILVIVNTTLTLEYS
###Gene_Info_Comments GLEAN3_14802 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 2.The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: LDNHYVILKFGKEMLKVSRIHHQPRAGPPPPPNILFLLFLLLLLLLLLLLLLLLLLLLLPLPWTWELQQQIIFLSFFLFFFREL
###Gene_Info_Comments GLEAN3_11202 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 6,7. The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: RENNLQHQSTGNSGLFSASFLHSPFLLSSLPSFIHPTNPKVELIDTKLAQMLLNVKRSAEGIRIRLHIVERWGRNETESLWMQTDQSDMYLDRDR
###Gene_Info_Comments GLEAN3_26877 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: QNCGCLAYLFTVVVMFYVNVRLIWKYTLKKESITLFESKTSFFSEALCNCENTVCALLNEMSVCLRVRVCVCEAEGGRERGTDRDEEKGDETRGSENVRLAEILEDEDAGWKEEGEPKKRSYDAHVLYWNLVLSSSLSWVPGATRALTK,NVGFRFLSKSNLFSLVAHDSFAIECTKYNYSACKYLNKKKTLFSQDILLFIKGHVSFVIALPPKMLKIFLHRIYFYIHAPTAESYSNDFRIGGKSNCILSPSLYNSSTILP
###Gene_Info_Comments GLEAN3_16168 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: QEIVRWPALCHTPSLPPFHSATTPPTRVDLSVTPPTFIVYSFPFYTYQTQTNKKQTKNICVKERRTVRPSTLYFSILTFLYFI
###Gene_Info_Comments GLEAN3_26905 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 9.The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: MKSSAVFCNSSMASSFISSIISSTSFFTFFFTFFLTFFLTFFLALSLAFFFAFFFTFFFALSLTFFFTFFFAFFLAFFFTCFFTRFFACFFACFLTCFLACFLICFFATFFPVFLAALLPPFLAADPATFIILTVC,FIYGIILHILHHIVYFFFHLLFYFLLDLFLDFLLGLILGLFLRLFLYFLLRFVLDLLFHLLFRLLFSLLLHLLFYSLLCLLLCLLLDLFLGLLFDLFLRHFLSGFLGRPLATFLGR,NLPDDEENNLQNNLDEPWPFISCQRACSGYVLFKDLYGWPSTLILLLAALVFSFLLFFFLPWRHFPVHFFLIIVIRPFACT
###Gene_Info_Comments GLEAN3_17404 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 1,7.
###Gene_Info_Comments GLEAN3_07242 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: QKVEDEHSSRPESLPDHHTDFQFLGELGSSQSLEYFLGKLGLSQSLEYFPEGQDCSSYEWNSYESIENADELTYNCTWCSIPKPCREGRGKQNNISSKI,LSMKLQCWVKNLRRLGGTRRRHNSRRLKTSTPAVRNRCQITTQTFSFLVNLGRLRALNIFLVNLGCLRALSIFPKVKIVVPTNGIPMRA
###Gene_Info_Comments GLEAN3_10438 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: LLPTPFQVFLHNRKQMCSIIHFLYVIALNEMQNVENVELIQVRVNFEASVHTALNECRLELHVCLCLLFFMSYMNIIHSPSELH
###Gene_Info_Comments GLEAN3_05831 ###
This gene model doesn't have a TIR domain, but the nucleotides encoding SP, NT, LRR(16-21), CT, and TM have 97% identity to another Sp-Tlr gene (GLEAN3_18838), so it could be a member of the Toll-like receptor family.
###Gene_Info_Comments GLEAN3_24820 ###
Blasted to PTPRA and PTPRT, but phylogenetic analysis showed that it is part of a novel clade also containing PTPRLec1, PTPRLec2, PTPRLec3, PTPRLec4, PTPRLec5, PTPRFn1, and PTPRFn2.
###Gene_Info_Comments GLEAN3_09602 ###
GLEAN3_09602 may be a duplicate prediction for GLEAN3_28213.
###Gene_Info_Comments GLEAN3_11286 ###
This sequence apparently encodes the first exon of Sp p38, the rest of which is contained in GLEAN3_10118. Amino acids 1-47 (approximately) correspond to p38. Correct sequences below:

DNA:
ATGTCTGCTTTTCATTCACTGCCAGAGGACTTCCATCACATTGAACTCAATAAAACGATATGGGAAGTCCCCAATCGGTATGTGCGACTGGAACCTGTGGGCTCAGGAGCGTATGGGCAAGTATGTTCAACAGAA

Protein:
MSAFHSLPEDFHHIELNKTIWEVPNRYVRLEPVGSGAYGQVCSTE
###Gene_Info_Comments GLEAN3_28213 ###
GLEAN3_09602 may be a duplicate prediction.
###Gene_Info_Comments GLEAN3_19852 ###
Blasts to PTPRA, but forms a novel clade in phylogenetic analysis with PTPRFn1, PTPRFn2, and PTPRLec2-6.  
###Gene_Info_Comments GLEAN3_00237 ###
partial sequence only, internal. Also seems to be missing an exon.
###Gene_Info_Comments GLEAN3_19770 ###
GLEAN3_19770 codes the first exon(s) for this gene. Rest of the gene is present in GLEAN3_19769.
###Gene_Info_Comments GLEAN3_19769 ###
GLEAN3_19770 codes the first exon(s) for this gene. Rest of the gene is present in GLEAN3_19769.
###Gene_Info_Comments GLEAN3_17187 ###
One of 3. This gene is a partial sequence, and is identical to 19022, which is longer and encompasses this gene. GLEAN3_01396 also is a JIP3, but is distinct from 19022 and 17187.  This gene and 19022 also BLAST well to XP_782498.1, sperm-associated antigen 9 isoform 1. 
###Gene_Info_Comments GLEAN3_09388 ###
Similar to phosphohistidine phosphatase 1.
###Gene_Info_Comments GLEAN3_26582 ###
Partial sequence.  See also GLEAN3_06528, GLEAN3_16411, GLEAN3_22686, and GLEAN3_18743. 
###Gene_Info_Comments GLEAN3_28046 ###
Similar to Protein phosphatase PP2A regulatory subunit A. Partial sequence.
###Gene_Info_Comments GLEAN3_01694 ###
Similar to Receptor-type tyrosine-protein phosphatase R. Partial sequence.
###Gene_Info_Comments GLEAN3_15535 ###
Similar to Receptor-type tyrosine-protein phosphatase R. See also GLEAN3_20488.
###Gene_Info_Comments GLEAN3_23889 ###
homolog: arrestin beta-1 from human, isoform B
###Gene_Info_Comments GLEAN3_20488 ###
Similar to Receptor-type tyrosine-protein phosphatase R. See also GLEAN3_15535.
###Gene_Info_Comments GLEAN3_03711 ###
Missing N-terminus. See GLEAN3_06561, _01698, _00076.
###Gene_Info_Comments GLEAN3_11457 ###
Missing N-terminus. See GLEAN3_14876.
###Gene_Info_Comments GLEAN3_05942 ###
See GLEAN3_14876, _11457.  
###Gene_Info_Comments GLEAN3_20542 ###
Blasts to PTPRK, but didn't clade with these genes in phylogenetic analysis.  Formed a unique clade with Glean3_15923. Partial sequence.
###Gene_Info_Comments GLEAN3_08253 ###
Similar to R-PTP-alpha.  See also GLEAN3_16053, GLEAN3_16144, GLEAN3_19852, GLEAN3_20604, GLEAN3_24537, GLEAN3_27101, and GLEAN3_22839. 
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137537808-25802-57187428179.BLASTQ4
###Gene_Info_Comments GLEAN3_17329 ###
The ATG (exon 1) probably lies in an unsequenced area.  Other notes for exon 15: 
- 23 bp missing after 23565
- 16 bp mismatch between 23566 and 23581
- 24 bp missing after 24183 & 24237
###Gene_Info_Comments GLEAN3_06155 ###
GLEAN modified to correspond to the EST.
It seems to be missing the C-terminal part of the expected protein. That part could be contained in the prediction glean3_17460, but the EST doesn't line up with the exon sequences there.
###Gene_Info_Comments GLEAN3_24537 ###
Similar to R-PTP-alpha.  See also GLEAN3_08253, GLEAN3_16053, GLEAN3_16144, GLEAN3_19852, GLEAN3_20604, GLEAN3_27101, and GLEAN3_22839.
###Gene_Info_Comments GLEAN3_28020 ###
It seems that this prediction has two matches to Nef3, in the C-terminus and the N-terminus.
###Gene_Info_Comments GLEAN3_15941 ###
It hits the same query, mouse Rufy3, as GLEAN3_28460. They may or may not be the same gene. It is named Sp-Rufy4.
###Gene_Info_Comments GLEAN3_28184 ###
Has one kazal and two TY domains - like a splice isoform of SMOC (Q9H4F8) - a SPARC homologue
###Gene_Info_Comments GLEAN3_02025 ###
Contains single NtA domain like N-terminus of agrin.
Other GLEAN predictions contain FOLN and KAZAL repeats and may comprise the next segment (especially GLEAN3_02467 and possibly GLEAN3_24994). A fourth gene looks like the next piece (GLEAN3_22633), and the adjacent gene (GLEAN3_22634) contains LamG repeats that look like the C-terminus. These five gene predictions may be adjacent and comprise a full agrin gene.
###Gene_Info_Comments GLEAN3_17460 ###
GLEAN describes the C-terminal part of the gene.
It could potentially be the C-terminal part of the gene described as glean3_06155, but no linking EST data is available.
###Gene_Info_Comments GLEAN3_15605 ###
Hh signaling pathway member
###Gene_Info_Comments GLEAN3_19022 ###
One of 3. This Glean is identical to and encompasses GLEAN3_17187, both of which BLAST to JIP3 as well as sperm-associated antigen 9. In addition, GLEAN3_01396 is also a JIP3, but does not match these others and so probably represents a true duplication.
###Gene_Info_Comments GLEAN3_24688 ###
See also GLEAN3_05592 and GLEAN3_06723.
###Gene_Info_Comments GLEAN3_05592 ###
See also GLEAN3_06723 and GLEAN3_24688.
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137537110-23138-78233751102.BLASTQ4

And this sequence is a duplicate of GLEAN3_06723
###Gene_Info_Comments GLEAN3_26916 ###
thanks to Charlie W.
###Gene_Info_Comments GLEAN3_01396 ###
One of 3. GLEAN3_17187 and 19022 also encode a JIP3, distinct from this gene.
###Gene_Info_Comments GLEAN3_25413 ###
Similar to c-myc binding protein.
###Gene_Info_Comments GLEAN3_18908 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: FTFTCYLSGQKKDPTILVLNRRSLLINIHEVPHPRAIWFYCYPPLHLNSSQPPDEYYKYSRSLCCVFVSCLCLYQAARSGL
###Gene_Info_Comments GLEAN3_07911 ###
Using SpADAM cDNA, two predictions align: GLEAN3_07911 and GLEAN3_20545. Both are nearly identical from 1010 to 3072. Missing exons are mostly on Scaffold_317, but some are only on Scaffold_663 (2 exons encoding 681-865). I have corrected gene features of GLEAN3_20545.
###Gene_Info_Comments GLEAN3_20545 ###
Using SpADAM cDNA, two predictions align: GLEAN3_07911 and GLEAN3_20545. Both are nearly identical from 1010 to 3072. Missing exons are mostly on Scaffold_317, but some are only on Scaffold_663 (2 exons encoding 681-865). I have corrected gene features of GLEAN3_20545.
###Gene_Info_Comments GLEAN3_10565 ###
looks like part of slit - the C-terminal half - perhaps the last few domains are artefacts/duplication - missing N-terminal half with more LRR repeats

ADJACENT GENE (GLEAN3_10564) - LOOKS LIKE THE N-TERMINAL HALF - could be one or two exons encoding LRR repeats missing at junction
###Gene_Info_Comments GLEAN3_16527 ###
novel architecture - TSP1 plus LamG x2 - no homologs known
###Gene_Info_Comments GLEAN3_18348 ###
FA58C-LamG-LamG structure defines this as a relative of CASPR - probably missing C-terminal half that should contain other domains (maybe FBG, more LamGs and EGFs, TM and/or 4.1m)

The prediction from GLEAN3_07341 looks like a likely candidate for C-terminus
###Gene_Info_Comments GLEAN3_06742 ###
Hh signaling pathway regulator
###Gene_Info_Comments GLEAN3_23409 ###
Has EGF and BNR repeats but no N-terminal reeler domain - probably a fragment.

GLEAN3_08268 is very similar in structure
###Gene_Info_Comments GLEAN3_08268 ###
Has EGF and BNR repeats but no N-terminal reeler domain - probably a fragment.

GLEAN3_23409 is very similar in structure
###Gene_Info_Comments GLEAN3_23757 ###
A cDNA containing everything but the 5' end of the gene was used in a BLAST search of the contig database, which resulted in the identification of an overlapping sequence that starts with a signal peptide. The first exon appears to be on the - strand whereas the other exons are on the + strand, suggesting an assembly error.
###Gene_Info_Comments GLEAN3_27145 ###
The full-length cDNA was assembled from several overlapping cDNA fragments and ESTs and confirmed by PCR of a full length ORF.  Gene features have been altered to comply with cDNA.
###Gene_Info_Comments GLEAN3_28236 ###
Likely the unique ortholog of human CDC2L5 and CrkRS
Additional evidence for the existence of the gene has been obtained in Sphaerechinus granularis
###Gene_Info_Comments GLEAN3_28463 ###
The C-terminal portion of this GLEAN is clearly MAPKAPK5, whereas the N-terminal portion appears to be CG12134-PA (BLAST XP_788954.1). The sequence below has been modified to correspond only to MAPKAPK5. The MAPKAPK part is partially duplicated by GLEAN3_21161.

MWSLGVIIYIMLCGYPPFYPDTPSRQLSKDMRHKIMAGQYEFPTEEWSLISDEAKDVVKRLLRVDPTERLTIEELCSHPWLRENSAPNTELHSPAIMLDKNMLDDAKQIHSEQLTAMRIPDKKVMLKPVAKANNPIVRKRILTRGQSIDNKIGEEQPPKKQNRENSEGVTCLRNIIAHCIVPPKDANGEDALCELMKRACQYNRDCPSLDKALNNLSWNGEQFCDKVDRSELALLLKDIVDQKERHEKC
###Gene_Info_Comments GLEAN3_23951 ###
GLEAN3_23951 lacks C-terminal SH3 domain present in homologs.
###Gene_Info_Comments GLEAN3_21161 ###
partial sequence,  identical to GLEAN3_28463 from aa 35-249 (with one exception). Mismatched short ends do not appear to result from frame shifts. Also BLASTs to XP_781571.1 (MAPKAPK5) with high e value.
###Gene_Info_Comments GLEAN3_13910 ###
Duplicate gene (non-identical) to other MAPKAPK5s: GLEAN3_28463 and _21161 (these latter 2 appear to be the same gene). The termini of this gene appear to be incorrect.
###Gene_Info_Comments GLEAN3_23676 ###
Also BLASTs strongly to XP_789413. Identical and internal to GLEAN3_06513
###Gene_Info_Comments GLEAN3_20611 ###
This gene appears to be missing an exon encoding SLLHLITQYLNPRTLSKDFQGK (aas 213-234). 
This is an overlapping identical duplicate of GLEAN3_25452.
###Gene_Info_Comments GLEAN3_25452 ###
Also BLASTs strongly to XP_791076.1. Appears to be an overlapping identical duplicate of GLEAN3_20611
###Gene_Info_Comments GLEAN3_11714 ###
sequence is only partial. GLEAN3_20782 is an internal identical duplicate of this gene. 
###Gene_Info_Comments GLEAN3_20782 ###
internal identical duplicate of GLEAN3_11714. Also BLASTs strongly to XP_797035.1
###Gene_Info_Comments GLEAN3_26498 ###
This is the C terminal part of the protein; the N terminal portion is encoded by GLEAN3_27848. These 2 gleans overlap (nucleotide level): bases 1-353 (this  glean).  
###Gene_Info_Comments GLEAN3_17694 ###
duplicate, non-identical to GLEAN3_10805. Partial sequence.
###Gene_Info_Comments GLEAN3_07222 ###
appears to be missing the start codon, but 3rd aa is present
###Gene_Info_Comments GLEAN3_27370 ###
The N-terminal sequence (exons from nt 8030 to 19367) is not part of Sp-CDK7. This sequence is similar to the sequence NP_000327.1 encoding sodium channel, nonvoltage-gated 1, beta [Homo sapiens]. 
Likely due to a problem of contig assembly.
The N-terminus of Sp-CDK7 is missing and the C-terminus (last two exons) is conflicting.
###Gene_Info_Comments GLEAN3_04845 ###
adhesion protein or cell surface receptor - novel architecture - FBG, an N-terminal  MNNL Notch ligand domain and multiple EGF-Ca repeats. Good Blast match with Notch homolog but that may be spurious (EGFs).
Could be a Notch or Notch ligand but does not have ankyrin repeats characteristic of Notch or DSL characteristic of Notch ligands
###Gene_Info_Comments GLEAN3_24020 ###
novel architecture - C-terminal FBG domain - might be related to a role in immune defense. Has pfam:Nacht domain - NTP-binding??

Blast match to tenascin is probably misleading
###Gene_Info_Comments GLEAN3_11551 ###
Based on best blast hit data, this protein is closely related to tolloid but lacks the C-terminal EGF, CUB and CUB domains.  One C-terminal predicted exon and two N-terminal exons do not encode conserved sequence and may not be part of this gene. 
###Gene_Info_Comments GLEAN3_07341 ###
LamG-EGF-LamG-4.1m - looks like C-terminus of CASPR or neurexin.
Neurexin gene (GLEAN3_24416) has its C-terminus - CASPR gene (GLEAN3_18348) does not. Suggests this is part of the CASPR gene.
###Gene_Info_Comments GLEAN3_14828 ###
All predicted exons supported by EST data.
###Gene_Info_Comments GLEAN3_12138 ###
EGF-LAMG-LAMG-EGF

these two domains occur in intermingled fashion in quite a few known proteins - neurexins, perlecans, CASPR

URCHINS APPEAR ALSO TO HAVE NOVEL EGF/LAMG PROTEINS
###Gene_Info_Comments GLEAN3_15404 ###
membrane-proximal portion of an adhesion receptor - a bit like Crumbs - has several LamG/EGF pairs and a TM domain

these two domains (or other EGF variants) occur in intermingled fashion in quite a few known proteins - neurexins, perlecans, agrin, crumbs, CASPR, some cadherins

urchins appear also to have novel EGF/LAMG proteins

###Gene_Info_Comments GLEAN3_22078 ###
EGF-LAMG-LAMG-EGF-LAMG

these two domains (or other EGF variants) occur in intermingled fashion in quite a few known proteins - neurexins, perlecans, agrin, crumbs, CASPR, some cadherins

urchins appear also to have novel EGF/LAMG proteins

###Gene_Info_Comments GLEAN3_24257 ###
EGF-EGF-LAMG-LAMG-EGF-EGF-LAMG

these two domains (or other EGF variants) occur in intermingled fashion in quite a few known proteins - neurexins, perlecans, agrin, crumbs, CASPR, some cadherins

urchins appear also to have novel EGF/LAMG proteins

###Gene_Info_Comments GLEAN3_04895 ###
Multiple EGFCa repeats and C-terminal Lamg/EGFCa modules
No obvious TM domain
Novel architecture
Essentially the same structure as GLEAN3_16555
Looks a bit like Crumbs but much larger and not the same domain organization
###Gene_Info_Comments GLEAN3_16555 ###
Multiple EGFCa repeats and C-terminal Lamg/EGFCa modules
No obvious TM domain
Novel architecture
Essentially same structure as >GLEAN3_04895
Look a bit like Crumbs but much larger and not the same domain organization
###Gene_Info_Comments GLEAN3_16807 ###
Multiple EGFCa repeats and three LamG domains interspersed before a TM segment
Looks rather like Crumbs in overall organization but larger.

GLEAN3_20365 is similar
###Gene_Info_Comments GLEAN3_09927 ###
This model is on a short scaffold and is probably lacking both N- and C-terminal exons.  One predicted exon, given below, cannot be validated by sequence similarity to members of the M12A class of proteases.
>GLEAN3_09927|Scaffold82736|3265|3440| DNA_SRC: Scaffold82736 START: 3265 STOP: 3440 STRAND: + 
GAGAAGAAGAAGAAGAAAAAGATGATGAAGAAGAAGATGAGGAGGAGGAGGATGATGAAGAAGAAGAAGA
AGGAGAAGATGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAAAAGAA
GAAGACAAACAAGAAGAGGAGGAGGAAGACGAAGAA

###Gene_Info_Comments GLEAN3_20365 ###
Multiple EGFCa repeats and three LamG domains interspersed before a TM segment
Looks rather like Crumbs in overall organization but larger.

GLEAN3_16807 is similar
###Gene_Info_Comments GLEAN3_21812 ###
Likely unique ortholog of human cyclin T1 and T2
###Gene_Info_Comments GLEAN3_28742 ###
This model contains exons encoding a protein most similar to SpAN, a sea urchin astacin protease, but lacks a C-terminal domain found in SpAN.
###Gene_Info_Comments GLEAN3_17070 ###
Comparison to best blast hit shows that the following exons are not conserved with other M12A class proteins:
>GLEAN3_17070|Scaffold1519|80742|80817| DNA_SRC: Scaffold1519 START: 80742 STOP: 80817 STRAND: + 
ATTTCTGCAGCCTGTCGCGAGAGATATTGGTTGCGGTTGCAATCTCGACGAGACTCTTCCAACATATGAA
GAGCAT
>GLEAN3_17070|Scaffold1519|82021|82153| DNA_SRC: Scaffold1519 START: 82021 STOP: 82153 STRAND: + 
GATGACGCCGAATTTTGAAAATCAACTCTGTATGTCTGCCCATCGATTGGTACCGAGTCGTCATGTAGAA
AATTGATACACGATCTGTTTGATATGTACTGCATAGTTTCCTCAATTACAGTTCTGAATGATT
>GLEAN3_17070|Scaffold1519|83467|83734| DNA_SRC: Scaffold1519 START: 83467 STOP: 83734 STRAND: + 
CTAACTCGCTATCGATCTCATACGGTAGTGTTGCGTCGGGCCATGTCGCTCCAGTTTCCACATTCCTCTT
GGTCCTACTCCCGTTGCCATTGTGACCATCCTCTTCCATGAACTTCTTCTGCTCTTCAGTAAGGCGGATA
TCTCCCAGGATGACGTCACCTGGATTCAGATTGTCCATTGGTTTGCTATGCTGTTCGGACTCAGCTTCCG
CATTGTGAGGGCGCGCCAACACCGTATCGTCAACGTCTTTCTTGAATGGTGGCAGAGA
>GLEAN3_17070|Scaffold1519|85911|86121| DNA_SRC: Scaffold1519 START: 85911 STOP: 86121 STRAND: + 
TGATTTTCCCTTGTCATCGACGAAATCATGGTCGTGATCACTGTTGAGAGGAAGACGTTCGTCATCGACT
GTCGTCGTGAATACGGCGACAGCTAGGCAGAGTAACAGTACAGAGCTCAGGCAAATCCTTTTCATCATTT
TTTCGTTCCGGCGTTTCGGCAAAGCGACGAGATTCTCCAAACCAACGGAGTGACAGTAGTAAGCAGCAGT
C
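
The exon records above carry START and STOP coordinates in their FASTA headers. As a minimal sketch only (it is not part of the annotation pipeline), the Python below compares each record's sequence length with STOP - START + 1. It assumes the coordinates are inclusive and that sequence lines contain only bases, with no trailing notes. The names `HEADER_RE` and `check_exon_lengths` are hypothetical.

```python
import re

# Matches the "START: n STOP: m" fields used in the exon FASTA headers.
HEADER_RE = re.compile(r"START:\s*(\d+)\s+STOP:\s*(\d+)")

def check_exon_lengths(fasta_text):
    """Return (header, expected_len, observed_len) for each record with coordinates.

    Assumes inclusive coordinates, so expected_len = |STOP - START| + 1.
    """
    records = []  # each entry: [header, expected_len, observed_len]
    for line in fasta_text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            m = HEADER_RE.search(line)
            expected = abs(int(m.group(2)) - int(m.group(1))) + 1 if m else None
            records.append([line[1:], expected, 0])
        elif line and records:
            # Accumulate the length of each wrapped sequence line.
            records[-1][2] += len(line)
    return [tuple(r) for r in records if r[1] is not None]
```

A record whose expected and observed lengths differ would point to a truncated sequence or a mis-copied coordinate.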

###Gene_Info_Comments GLEAN3_09203 ###
Model probably contains partial CDS because it is at the end of a scaffold.  Of the 5 predicted exons, only 3 and 4 contain conserved sequences.
###Gene_Info_Comments GLEAN3_14989 ###
The N-terminal sequence of this cyclin is probably the one encountered in GLEAN3_11295.   
Three GLEAN predictions (GLEAN3_00328, GLEAN3_14989 and GLEAN3_11295) encode the cyclin L protein. They differ at the N-terminal end.

###Gene_Info_Comments GLEAN3_00218 ###
The model contains exons encoding CUB domains similar to those found in tolloid-like proteins within the M12A metalloprotease subfamily.  These sequences are very similar to those in GLEAN3_17070, but are too divergent to be allelic.  This is very likely to be partial CDS.
###Gene_Info_Comments GLEAN3_27114 ###
Comparison to best blast hit sequence suggests that the gene model contains exons encoding a tolloid/BMP-1 like protein.  The model is likely to be partial because there are only 3 cub domains instead of the 5 normally associated with this subclass of astacin proteases.  Note that this model is adjacent to a very similar gene, GLEAN3_27115.
###Gene_Info_Comments GLEAN3_27115 ###
Comparison to best blast sequences suggests that this model contains exons encoding a protein related to tolloid and bmp-1.  It is likely to be partial because there is only 1 cub domain rather than the 5 characteristic of proteins in this subclass of M12A.  Note that it is adjacent to a very similar gene, GLEAN3_27114.
###Gene_Info_Comments GLEAN3_19518 ###
identical to glean3_00129 over >200 aa; possible mis-assembly
###Gene_Info_Comments GLEAN3_26758 ###
strong identity to HIF2a through first 239 aa
###Gene_Info_Comments GLEAN3_08353 ###
needs to be split
aa 1-~900 = similar to Biotin protein ligase
aa ~900-1676 = similar to SIM but missing N-term.
N-terminal likely found in GLEAN3_13962
###Gene_Info_Comments GLEAN3_21277 ###
gi|72046985|ref|XP_786603.1|  PREDICTED: similar to ataxin 2 [Strongylocentrotus purpuratus]Length=898

###Gene_Info_Comments GLEAN3_24739 ###
Comparison to best blast sequence suggests that all but the first and last exons in this model are conserved with tolloid-like proteins.  While the first predicted exon may be part of this gene, the last one ( >GLEAN3_24739|Scaffold97632|8420|9724| ) encodes peptide sequence similar to other proteins.
###Gene_Info_Comments GLEAN3_03612 ###
This model contains exons encoding an astacin protease of the tolloid family most closely related to the sea urchin protein SpAN, but lacks other exons characteristic of this group of proteins, such as those encoding CUB domains, probably because they are on other scaffolds and this model is located at the end of scaffold 72693.
###Gene_Info_Comments GLEAN3_04586 ###
short with TM/LDLA/TY domains
###Gene_Info_Comments GLEAN3_05781 ###
NOVEL ARCHITECTURE - TY-TY-WAPx5-VWCx4-WAP domains
###Gene_Info_Comments GLEAN3_12001 ###
NOVEL ARCHITECTURE - EGFCa interspersed with 3 TY repeats
###Gene_Info_Comments GLEAN3_19601 ###
NOVEL ARCHITECTURE - WAPx4-TY-x3-EGFx3  domains
###Gene_Info_Comments GLEAN3_27371 ###
LNB-7TM receptor - lots of calx-B repeats

HOMOLOGOUS WITH "VERY LARGE G PROTEIN-COUPLED RECEPTOR 1", VLGR1/MASS1/GPR98 - mutated in Usher syndrome 2C

PREVIOUSLY CHORDATE RESTRICTED
###Gene_Info_Comments GLEAN3_00957 ###
probably an ECM protein given its domain composition - TSPN - many VWC -VWD at C-terminus - novel architecture
GLEAN3_04940 has very similar structure minus the TSPN
the FN1 predictions are likely alternative predictions for the VWC repeats
###Gene_Info_Comments GLEAN3_04940 ###
probably an ECM protein given its domain composition -  
many VWC and VWD at C-terminus - novel architecture
GLEAN3_00957 has very similar structure plus a TSPN domain at N-terminus
the FN1 predictions are likely alternative predictions for the VWC repeats
###Gene_Info_Comments GLEAN3_27525 ###
The C-terminus of this gene is in GLEAN3_27526; the two models should be combined.
###Gene_Info_Comments GLEAN3_26798 ###
This is part of a sea urchin specific group of ADAM-TS genes.  There are two worm ADAMTS genes with some sequence similarity (wormbase# F08C6.1a.1 and C02B4.1)
###Gene_Info_Comments GLEAN3_12296 ###
aa 1-235 strong id to ss and AHRs
See also Glean3_13788 and Glean3_05022 for AHR-like sequences.
Glean3_05022 may be the bHLH domain of this model or of Glean3_13788.
###Gene_Info_Comments GLEAN3_16604 ###
GLEAN3_15419 model on scaffold 76434 is also part of predicted Sp-Pask.   FgeneshAB prediction S.P_Scaffold70175 may have additional exons, based on alignment with mammalian PAS-K.
###Gene_Info_Comments GLEAN3_02947 ###
The only domain it contains is a portion of a metalloprotease domain.
###Gene_Info_Comments GLEAN3_13296 ###
only domain is the reprolysin domain
###Gene_Info_Comments GLEAN3_13297 ###
only domain it contains is a part of the reprolysin domain
###Gene_Info_Comments GLEAN3_25061 ###
contains TSP1 repeats found in ADAM-TS sequences
###Gene_Info_Comments GLEAN3_01545 ###
contains a TSP1 and calcium-binding domain

###Gene_Info_Comments GLEAN3_05275 ###
PREDICTED: Strongylocentrotus purpuratus similar to Machado-Joseph disease protein 1 (Ataxin-3) (LOC581652), mRNA

###Gene_Info_Comments GLEAN3_05234 ###
Comparison to best blast sequence suggests that the model contains exons encoding an astacin protease.  It lacks domains characteristic of the closely related astacins, tolloid and BMP1.  One of the predicted exons (>GLEAN3_05234|Scaffold6580|26133|26420) probably belongs to another gene.  The inferred amino acid sequence from the last two predicted exons (>GLEAN3_05234|Scaffold6580|26133|26420; >GLEAN3_05234|Scaffold6580|30923|31151| ) is not conserved.
###Gene_Info_Comments GLEAN3_10948 ###
This model encodes a protein with the same domain architecture as the sea urchin protein SpAN; the primary sequence indicates that it is a different gene.
###Gene_Info_Comments GLEAN3_01560 ###
Comparison to best blast sequence suggests that the model contains some but not all of the domains characteristic of tolloid/bmp1 proteins.  The last 4 predicted exons given below contain sequences similar to other kinds of proteins and therefore may not be part of this model.
>GLEAN3_01560|Scaffold70398|20995|21158| DNA_SRC: Scaffold70398 START: 20995 STOP: 21158 STRAND: + 
CCTTGGGCCTTGAGAGTTATGTCATCCCAGATTCAAGTCTGACAGCTTCCAGTGAATTTAATGCTGACCA
TGGTGCAAAGAGAGGTCGTCTTAACCTGGCCAGAGTCGGGGATCTGCGTGGAGGCTGGAATCCAATGGAC
AACGATGCAAACCCGTGGATCCAG (no blasts to something else; maybe EGF)

>GLEAN3_01560|Scaffold70398|23209|23361| DNA_SRC: Scaffold70398 START: 23209 STOP: 23361 STRAND: + 
GTGGATCTTCTGGACCTTTACCGTATCATTTCAGTTGCGACTCAAGGGCGACAAGATCTTGACCAGTGGG
TTAATAGCTACAAGCTTGCTTGGAGTACTGATGGCACGACCTTTCGCACAGTGCAGGACATTCCCGGGCC
AGGAGCTGACAAG (blasts as previous exon)

>GLEAN3_01560|Scaffold70398|23720|23890| DNA_SRC: Scaffold70398 START: 23720 STOP: 23890 STRAND: + 
ATCTTCATCGGTAATGTTGACCGCAACACCATCATGACCAACACTCTGCCTGTGTCCCAGGTTTGCCGCT
ATTTCCGCTTGATGCCTGTCAGCTGGTATAAACACATTAGTGTTCGTATGGAGATATATGGATATGGTGA
AGGCCCTGTCACAGGTCAGTATGAAAACTAG (blasts as previous two exons)

>01560
MSRTLLLSGLVAMLMAYSLAKPLRKQKGYTKTKVPQIKKVEFNGEILEIAVEEDDPFHRPIPADEGYSPNAYETDMMLNPEQEAALSDPKNSRNKRKASKDTTKYWPKKIIDQATSQHVINVPYEFGLGVDRTAIKAAMAHWQDQTCVRFEIHDRSVSSLWQHRLKFIKSDGCYSYLGLQSKIGFQDVSIGKGCTRLGTVSHEIGHALGFWHEQSRPDRDEFVTVNFANIIQDKMNAFRKHTTDDVMTNVPYDYNSVMHYGAYGFGIDAKVPTLIPKDPLSMGEIGQRLGLSYLDVKLANFMYECDSHCPGASSCHSGFRDMNCKCRCPESHKGDYCEVVALNFPGNLGNPDEQIRLKFDALDMEPFDTSSKKCLDYINIRAGGNLYYEGTDFCGNTLPPEIIADEIILSFHSDETNTNKGFHGTYTREKISALGLESYVIPDSSLTASSEFNADHGAKRGRLNLARVGDLRGGWNPMDNDANPWIQVDLLDLYRIISVATQGRQDLDQWVNSYKLAWSTDGTTFRTVQDIPGPGADKIFIGNVDRNTIMTNTLPVSQVCRYFRLMPVSWYKHISVRMEIYGYGEGPVTGQYEN

###Gene_Info_Comments GLEAN3_26547 ###
There may be an assembly problem with this model since part of the protease domain is repeated.  In this model the order of domains is partial astacin protease, cub, cub, then what is probably the beginning of the protein - N-terminal signal peptide, activation domain, astacin protease, cub, cub.
###Gene_Info_Comments GLEAN3_18198 ###
Matches the Lysosomal trafficking regulator from rat along the entire coding sequence.  Conservation very high at 3' end.  Tiling experiment indicates high expression in embryos.
###Gene_Info_Comments GLEAN3_22164 ###
an exon may be missing
###Gene_Info_Comments GLEAN3_16045 ###
One of 2. GLEAN3_04024 is an exact duplicate of this protein, although 04024 is shorter and missing the N terminus.
###Gene_Info_Comments GLEAN3_15349 ###
the N terminal region in particular matches AMPK-like, while the C terminus does not BLAST strongly
###Gene_Info_Comments GLEAN3_17949 ###
One of 2. An almost perfect duplicate of GLEAN3_09559. This protein is longer, appears to contain the true N terminus and an exon missing from 09559.
###Gene_Info_Comments GLEAN3_09878 ###
one of 2. Non-identical duplicate of GLEAN3_03844
###Gene_Info_Comments GLEAN3_23875 ###
One of 4. Non-identical duplicate of GLEAN3_00442, 23876, 08085.
###Gene_Info_Comments GLEAN3_05676 ###
GLEAN3_04836 is a partial duplicate prediction.
###Gene_Info_Comments GLEAN3_04836 ###
GLEAN3_04836 is a partial duplicate prediction for GLEAN3_05676.
###Gene_Info_Comments GLEAN3_09559 ###
One of 2. This is an almost-perfect duplicate of GLEAN3_17949. This protein is missing the N terminus and an internal exon, but otherwise is an exact match.
###Gene_Info_Comments GLEAN3_26779 ###
Partial sequence identical and included in GLEAN3_24526
###Gene_Info_Comments GLEAN3_00442 ###
One of 5. Non-identical duplicate of GLEAN3_23875, 23876, 08085, 19751
###Gene_Info_Comments GLEAN3_17487 ###
One of 2. This protein appears to be a shortened version of GLEAN3_05613, which has a much longer N terminus. This protein (17487) has a slightly longer C terminus.
###Gene_Info_Comments GLEAN3_28480 ###
First half completely predicted. Last half of the gene missing. GLEAN3_17903 is a partial duplicate prediction.
###Gene_Info_Comments GLEAN3_17903 ###
First half completely predicted. Last half of the gene missing. GLEAN3_28480 is a partial duplicate prediction.
###Gene_Info_Comments GLEAN3_05613 ###
One of 2. GLEAN3_17487 overlaps C-terminus and is nearly identical. 17487 is slightly longer at C-terminus, but this protein (05613) is considerably longer in the N terminus.
###Gene_Info_Comments GLEAN3_09980 ###
- shows comparable homology to vertebrate terminal deoxyribonucleotidyltransferase (TdT) and vertebrate polymerase mu
- one intron was skipped by the GLEAN3 prediction
###Gene_Info_Comments GLEAN3_26447 ###
PREDICTED: similar to fragile X mental retardation gene 1,
autosomal homolog [Strongylocentrotus purpuratus].
###Gene_Info_Comments GLEAN3_23876 ###
One of 3. This one is the longest, and is non-identical to either GLEAN3_00442 or GLEAN3_23875
###Gene_Info_Comments GLEAN3_08085 ###
One of 5. Non-identical duplicates of 00442, 19751 and 23875. Nearly identical (and internal) to 23876, although the C terminus of this protein diverges. 
###Gene_Info_Comments GLEAN3_28711 ###
Possibly missing an exon in the middle.
###Gene_Info_Comments GLEAN3_19009 ###
Strongylocentrotus purpuratus mRNA for SuDp98 protein Length=3650

###Gene_Info_Comments SpRag2L ###
This gene has been verified by RACE and RT-PCR.  It is expressed at low levels in early gastrula embryos, adult coelomocytes, and other adult tissues.  Though it has only low sequence identity with vertebrate Rag2, it is predicted to have the same structure and is encoded in reverse orientation downstream of SpRag1L (a Rag1-like gene).
###Gene_Info_Comments GLEAN3_01138 ###
Pfam00194 match. 

Transcriptome data indicate that it is expressed in the embryo.

A family of carbonic anhydrase-like proteins exists in the sea urchin. Orthologous relationships with vertebrate carbonic anhydrases and related proteins have not been carefully examined.
###Gene_Info_Comments GLEAN3_10623 ###
GLEAN3_10623 is a partial duplicate prediction for GLEAN3_27004.
###Gene_Info_Comments GLEAN3_27004 ###
GLEAN3_10623 is a partial duplicate prediction for GLEAN3_27004.
###Gene_Info_Comments GLEAN3_12518 ###
pfam00194 match. 

Transcriptome data indicate that it is expressed in the embryo.

A family of carbonic anhydrase-like proteins exists in the sea urchin. Orthologous relationships with vertebrate carbonic anhydrases and related proteins have not been carefully examined.
###Gene_Info_Comments GLEAN3_24809 ###
Pfam00194 match.  

Transcriptome data indicate that it is expressed in the embryo.

A family of carbonic anhydrase-like proteins exists in the sea urchin. Orthologous relationships with vertebrate carbonic anhydrases and related proteins have not been carefully examined.
###Gene_Info_Comments GLEAN3_14262 ###
these other Glean3 sequences had high similarity to endonuclease-reverse transcriptase: Glean3_24197, Glean3_26145, Glean3_02879 
###Gene_Info_Comments GLEAN3_24197 ###
these other Glean3 sequences also had high similarity to endonuclease-reverse transcriptase: Glean3_14262, Glean3_26145, Glean3_02879
###Gene_Info_Comments GLEAN3_26145 ###
these other Glean3 sequences also have high similarity to endonuclease-reverse transcriptase: Glean3_14262, Glean3_24197, Glean3_02879
###Gene_Info_Comments GLEAN3_02879 ###
these other Glean3 sequences also have high similarity to endonuclease-reverse transcriptase: Glean3_14262, Glean3_24197, Glean3_26145
###Gene_Info_Comments GLEAN3_08844 ###
This blasts to PPEF1, but phylogenetic analysis showed that it was a homologue of human PPEF2.  Glean3_11860 is likely the identical protein.
###Gene_Info_Comments GLEAN3_19367 ###
Glean3_22254 is a partial sequence of this entry
###Gene_Info_Comments GLEAN3_22254 ###
This sequence was a partial sequence of Glean3_19367 and has been modified to include the sequence from Glean3_19367
###Gene_Info_Comments GLEAN3_11655 ###
domains LDLa - CCP x4 - EGFCa x3 - Ig - SEA 7TM_2

No GPS but otherwise looks like a member of the LNB-7TM family of adhesion domain GPCRs
###Gene_Info_Comments GLEAN3_12383 ###
KR-FA58Cx3-CLECT-EGFCa-CUB X3-LDLa x6-LRR 7TM_1
No GPS but looks a bit like a member of the LNB-7TM family of adhesion domain GPCRs or like a glycoprotein hormone receptor - 7TM_1 favors latter
No known LDLa, KR or FA58C members of LNB7TM GPCR family
Novel architecture
###Gene_Info_Comments GLEAN3_15198 ###
Lyx4- EGFCa- Lyx2-LDLa- EGFCax2-Ig 7TM_2

No GPS but otherwise looks a bit like a member of the LNB-7TM family of adhesion domain GPCRs
No known LDLa or LY members of LNB7TM GPCR family
Novel architecture
###Gene_Info_Comments GLEAN3_23185 ###
EGFCa x13-Ig 7TM_2

No GPS but otherwise looks like a member of the LNB-7TM family of adhesion domain GPCRs
###Gene_Info_Comments GLEAN3_26721 ###
EGFCax3-Ig 7TM_2

No GPS but otherwise looks like a member of the LNB-7TM family of adhesion domain GPCRs
###Gene_Info_Comments GLEAN3_07507 ###
NIDO and VWD domains characteristic of mucins
###Gene_Info_Comments GLEAN3_07661 ###
NIDO and VWD domains characteristic of mucins
###Gene_Info_Comments GLEAN3_18197 ###
NIDO, IPT, AMOP, VWD,CCP - LOOKS LIKE A MUCIN but also has hyalin repeats
###Gene_Info_Comments GLEAN3_20968 ###
NIDO and VWD domains characteristic of mucins
###Gene_Info_Comments GLEAN3_25062 ###
NIDO, VWD AND EGF_CA TM - very similar structure to mucin4d of chickens
###Gene_Info_Comments GLEAN3_05955 ###
DSL and EGFCa domains characteristic of Notch ligands
Gene structure looks like partial duplication - assembly problems?
###Gene_Info_Comments GLEAN3_11976 ###
DSL and EGFCa domains characteristic of Notch ligands
Gene structure looks like partial duplication - assembly problems?
###Gene_Info_Comments GLEAN3_13510 ###
DSL and EGFCa domains characteristic of Notch ligands
Gene structure looks like partial duplication - assembly problems?
Also very short - fragment?
###Gene_Info_Comments GLEAN3_13646 ###
DSL and EGFCa domains characteristic of Notch ligands
Rather short - fragment?
###Gene_Info_Comments GLEAN3_16194 ###
contains no domains
###Gene_Info_Comments GLEAN3_10547 ###
DSL and EGFCa domains characteristic of Notch ligands
###Gene_Info_Comments GLEAN3_21193 ###
This sequence roots the vertebrate clade containing both ADAM-TS16 and ADAM-TS18 genes
###Gene_Info_Comments GLEAN3_16016 ###
DSL and EGFCa domains characteristic of Notch ligands
###Gene_Info_Comments GLEAN3_06268 ###
only domain it contains is an ADAMs spacer
###Gene_Info_Comments GLEAN3_21044 ###
DSL and EGFCa domains characteristic of Notch ligands

Gene structure looks like a duplication - assembly problems?

###Gene_Info_Comments GLEAN3_25985 ###
DSL and EGFCa domains characteristic of Notch ligands
###Gene_Info_Comments GLEAN3_00680 ###
the sequence roots the vertebrate clade containing both ADAM-TS6 and ADAM-TS10
###Gene_Info_Comments GLEAN3_18098 ###
contains a reprolysin domain and adams spacer
###Gene_Info_Comments GLEAN3_16423 ###
similar to gamma-interferon inducible lysosomal thiol reductase (GILT) - vertebrate GILT cleaves disulfide bonds in proteins and is involved in MHC class II-restricted antigen processing.  

###Gene_Info_Comments GLEAN3_27456 ###
contains only part of the reprolysin domain
###Gene_Info_Comments GLEAN3_08756 ###
This gene roots the vertebrate clade containing both ADAM-TS7 and ADAM-TS12.
###Gene_Info_Comments GLEAN3_14521 ###
appears to be a haplotype, but is lacking a portion of the sequence.
###Gene_Info_Comments GLEAN3_03170 ###
This is part of a sea urchin specific group of ADAM-TS genes.  There are two worm ADAMTS genes with some sequence similarity (wormbase# F08C6.1a.1 and C02B4.1). May be a haplotype.
###Gene_Info_Comments GLEAN3_04710 ###
This gene roots the clade of ADAM-TS2 and ADAM-TS3
###Gene_Info_Comments GLEAN3_10597 ###
contains metalloprotease and reprolysin domains, which are the first parts of an ADAM-TS gene.
###Gene_Info_Comments GLEAN3_04088 ###
only domain it contains is the reprolysin domain
###Gene_Info_Comments GLEAN3_18171 ###
haplotype found
###Gene_Info_Comments GLEAN3_18228 ###
Similar to Tyrosine-protein phosphatase 10D precursor.  Receptor-linked protein-tyrosine phosphatase 10D.  Partial Sequence
###Gene_Info_Comments GLEAN3_26428 ###
Blasts to Ppm1h, but is not homologous to human Ppm1h in PP2C subfamily tree.
###Gene_Info_Comments GLEAN3_03844 ###
One of 2. Non-identical duplicate of GLEAN3_09878
###Gene_Info_Comments GLEAN3_14936 ###
Also BLASTs to XP_783422.1 (it's a tie). One of 2, identical duplicate of GLEAN3_22395
###Gene_Info_Comments GLEAN3_22395 ###
Also BLASTs to XP_783422.1 (it's a tie). One of 2, identical duplicate of GLEAN3_14936
###Gene_Info_Comments GLEAN3_04024 ###
One of 2. This is a shorter, identical duplicate of GLEAN3_16045. This lacks the N terminus
###Gene_Info_Comments GLEAN3_25169 ###
One of 2. GLEAN3_24676 is an identical duplicate in most of the N terminal portion (although the extreme N termini diverge); however the C terminal portions are divergent. The divergences are not due to frame shifts. This GLEAN appears to contain a start codon, unlike 24676.
###Gene_Info_Comments GLEAN3_24676 ###
One of 2. GLEAN3_25169 is an identical duplicate in most of the N terminal portion (although the extreme N termini diverge); however the C terminal portions are divergent. The divergences are not due to frame shifts. This GLEAN does not appear to contain a start codon, unlike 25169.
###Gene_Info_Comments GLEAN3_01928 ###
This is the closest genbank match to the published hyalin clone. That sequence was incomplete, and the hyalin repeats appear in many genes so it is unclear whether this is the "authentic" hyalin, or whether there is a family of matrix proteins expressed in embryos.  The identity between the cloned gene and this glean model covers much of the model with missing Hyalin repeats at the N terminus.
###Gene_Info_Comments GLEAN3_14069 ###
Similar to SpRag1L (GLEAN3_27600), a sea urchin Rag1-like gene.  One of the more complete matches.  Probably a pseudogene.  Matches Rag1 core region, but c-terminal matching (SpRag1L 789-879) is attached to N-terminal. Region of match is SpRag1L: 380-874, ~39% AA identity. Similar to GLEAN3_09909.
###Gene_Info_Comments GLEAN3_16839 ###
One of 3. Partially overlaps with 08255, which is identical in the N terminal part of the overlapping sequence, but divergent in the C terminal part. Also overlaps (in a distinct region) with 08254, which is identical and entirely contained in 16839. Note that 08254 and 08255 do NOT overlap.
###Gene_Info_Comments GLEAN3_08254 ###
This protein is identical and completely internal to GLEAN3_16839.
###Gene_Info_Comments GLEAN3_08255 ###
Partially duplicated by GLEAN3_16839. The overlapping region is identical in the N terminal part, but divergent in the C terminal part, protein and nucleotide.
###Gene_Info_Comments GLEAN3_04671 ###
Non-identical to other IKKs: GLEANs 16839, 08254, 08255. GLEAN3_11356 is a shorter, internal identical duplicate, although 11356 seems to contain a spurious stretch of amino acids (see that sequence).
###Gene_Info_Comments GLEAN3_07638 ###
One of 2. GLEAN3_27909 is an almost identical duplicate 
###Gene_Info_Comments GLEAN3_27909 ###
One of 2. GLEAN3_07638 is an almost identical duplicate
###Gene_Info_Comments GLEAN3_00053 ###
52 hyalin repeats, 10 EGF repeats plus 6 other exons.  Incomplete gene at end of scaffold 1258.  Scores at high level in tiling experiment against embryos.  Placed 5th in hyalin family because its match against hyalin1 is in HYR domains and is significant but intermittent.
###Gene_Info_Comments GLEAN3_08700 ###
Glean3_11110 is a partial sequence with an  exact match to this sequence
###Gene_Info_Comments GLEAN3_11110 ###
This is a partial sequence that is an exact match with Glean3_08700 
###Gene_Info_Comments GLEAN3_02435 ###
Similar to Sidekick 2. Partial sequence.
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137536477-1679-116181527840.BLASTQ4
###Gene_Info_Comments GLEAN3_19632 ###
This is the closest match to Sp-hyalin1. It matches at HYR's.  It has 18 hyalin repeats and 9 EGF repeats in 27 exons.  It spans the complete scaffold and is incomplete
###Gene_Info_Comments GLEAN3_28412 ###
One of 2. GLEAN3_16505 is a nearly identical duplicate
###Gene_Info_Comments GLEAN3_00806 ###
possibly an FGF-R, probably a RTK
###Gene_Info_Comments GLEAN3_03870 ###
This gene model doesn't have a TIR domain, but the nucleotides encoding SP, NT, LRR(12-22), CT and TM have 90% identity to another Sp-Tlr gene (GLEAN3_20741), and the model is located at the end of a contig.  So it could be a member of the Toll-like receptor family.

###Gene_Info_Comments GLEAN3_02779 ###
see also Sp-HSP70(3)A
First described by,
AUTHORS   Foltz,K.R., Partin,J.S. and Lennarz,W.J.
  TITLE     Sea urchin egg receptor for sperm: sequence similarity of binding
            domain and hsp70
  JOURNAL   Science 259 (5100), 1421-1425 (1993)
###Gene_Info_Comments GLEAN3_22798 ###
The BLAST hit is quite weak and only picks up the STK domain. 
###Gene_Info_Comments GLEAN3_02418 ###
One of 2. GLEAN3_06947 is an identical duplicate, but is missing the N term. This protein appears to contain the start codon.
###Gene_Info_Comments GLEAN3_25210 ###
Duplicate prediction for GLEAN3_07944.
###Gene_Info_Comments GLEAN3_25819 ###
Duplicate prediction for GLEAN3_26605
###Gene_Info_Comments GLEAN3_09909 ###
Similar to SpRag1L (GLEAN3_27600), a sea urchin Rag1-like gene.  One of the more complete matches.  Probably a pseudogene.  Matches Rag1 core region, but c-terminal matching (SpRag1L 753-862) is attached to N-terminal (As for GLEAN3_14069). Region of match is SpRag1L: 399-862, ~41% AA identity.   
###Gene_Info_Comments GLEAN3_23673 ###
This Glean3 sequence appears to be a duplication; Glean3_23674 and Glean3_23672 are on the same scaffold and also match to endonuclease reverse transcriptase
###Gene_Info_Comments GLEAN3_23674 ###
This sequence appears to be a duplication; Glean3_23672 and Glean3_23673 are on the same scaffold and also match to endonuclease reverse transcriptase
###Gene_Info_Comments GLEAN3_23672 ###
This sequence appears to be a duplication; Glean3_23674 and Glean3_23673 are on the same scaffold and also match to endonuclease reverse transcriptase
###Gene_Info_Comments GLEAN3_15315 ###
may be truncated at C terminus
###Gene_Info_Comments GLEAN3_15136 ###
Similar to SpRag1L (GLEAN3_27600), a sea urchin Rag1-like gene.  One of the more complete matches.  Probably a pseudogene.  Matches Rag1 core region. Region of match is SpRag1L: 607-977, 38% AA identity.   
###Gene_Info_Comments GLEAN3_09908 ###
Similar to SpRag1L (GLEAN3_27600), a sea urchin Rag1-like gene. Probably a pseudogene.  Matches Rag1 core region. Region of match is SpRag1L: 557-647, 37% AA identity.   
###Gene_Info_Comments GLEAN3_26698 ###
Similar to portion of Sp-Rag1L (GLEAN3_27600). Probably a pseudogene in combination with transposase.  Matches Sp-Rag1L N-terminal putative Zn-binding region: AA 7-105, 32%.
###Gene_Info_Comments GLEAN3_08886 ###
Similar to UBXD2.
###Gene_Info_Comments GLEAN3_24394 ###
One of 2. GLEAN3_24395 is almost identical, but slightly shorter.
###Gene_Info_Comments GLEAN3_20853 ###
Similar to UBXD1.
###Gene_Info_Comments GLEAN3_12427 ###
Similar to UBXD1.
###Gene_Info_Comments GLEAN3_24395 ###
One of 2. GLEAN3_24394 is almost identical but is slightly longer.
###Gene_Info_Comments GLEAN3_19658 ###
Similar to Receptor-type tyrosine-protein phosphatase delta precursor (Protein-tyrosine phosphatase delta) (R-PTP-delta), Partial sequence
###Gene_Info_Comments GLEAN3_07508 ###
possibly an SNF-1 like serine threonine kinase
###Gene_Info_Comments GLEAN3_14978 ###
Similar to VEGF.
###Gene_Info_Comments GLEAN3_27566 ###
similar to protein tyrosine phosphatase, receptor type, D isoform 2 precursor
###Gene_Info_Comments GLEAN3_20281 ###
Similar to Tyrosine-protein phosphatase, non-receptor type 1/2 (Protein-tyrosine phosphatase 1B) (PTP-1B), partial 
###Gene_Info_Comments GLEAN3_18356 ###
has 1 cub domain and 25 HYR domains each as a distinct exon.  Low to no expression in embryos.  Likely to be a complete gene.  Similar to Sp-hyalin1 due to homology of hyalin repeats.  
###Gene_Info_Comments GLEAN3_19751 ###
One of 5. Non-identical duplicate of GLEAN3_00442, 08085, 23876. Identical to and completely internal to GLEAN3_23875
###Gene_Info_Comments GLEAN3_25766 ###
Similar to Receptor-type tyrosine-protein phosphatase mu precursor.  Duplicates
###Gene_Info_Comments GLEAN3_16500 ###
blastp shows 50% alignment to actin domain not seen in other urchin hsp70s
###Gene_Info_Comments GLEAN3_06056 ###
This GLEAN result contains the C-terminal region of the PLCg sequence.  The N-terminal sequence is on scaffold 53431 and in GLEAN3_27462


###Gene_Info_Comments GLEAN3_10275 ###
This is the N-terminal region of PLCg.  The full annotation can be found on scaffold 53431 and GLEAN3_27462

###Gene_Info_Comments GLEAN3_10275 ###
 fragment, extra stretch in middle
###Gene_Info_Comments GLEAN3_24947 ###
3' partial 
GLEAN3_24946 belongs to 5' end of SpWntA
###Gene_Info_Comments GLEAN3_24669 ###
3' partial
GLEAN3_23065 is an identical duplicated fragment
GLEAN3_23463 belongs to 5' end of SpWnt4 

Reference:
Ferkowicz,M.J., Stander,M.C. and Raff,R.A.
Phylogenetic relationships and developmental expression of three sea urchin Wnt genes
Mol. Biol. Evol. 15 (7), 809-819 (1998)
###Gene_Info_Comments GLEAN3_23065 ###
3' partial
GLEAN3_24669 is an identical duplicated fragment
GLEAN3_23463 belongs to 5' end of SpWnt4 

Reference:
Ferkowicz,M.J., Stander,M.C. and Raff,R.A.
Phylogenetic relationships and developmental expression of three sea urchin Wnt genes
Mol. Biol. Evol. 15 (7), 809-819 (1998)
###Gene_Info_Comments GLEAN3_05686 ###
Incomplete KH domain at the N-terminal end of the GLEAN3_05686 prediction.
3 additional exons found in scaffold20583.
Only 2 KH domains are present on the scaffold; the PCBP family usually contains 3.
###Gene_Info_Comments GLEAN3_21590 ###
Bucentaur contains a LINE repeat sequence in some species
###Gene_Info_Comments GLEAN3_23298 ###
Partial sequence
###Gene_Info_Comments Sp-Tlr001 ###
Partial Toll-like receptor. The nucleotides encoding CT, TM and TIR have 96% identity to another Sp-Tlr gene (08963). This gene model occupies the entire sequence of a short scaffold.

###Gene_Info_Comments Sp-Tlr002 ###
Partial Toll-like receptor. The nucleotides encoding CT, TM and TIR have 98% identity to another Sp-Tlr gene (24205). This gene model is located at the end of a short scaffold.

###Gene_Info_Comments GLEAN3_23993 ###
Duplicate prediction for GLEAN3_06932
###Gene_Info_Comments Sp-Tlr210 ###
Partial Toll-like receptor. The nucleotides encoding CT, TM and partial TIR have 98% identity to another Sp-Tlr gene (21936). This gene model occupies the entire sequence of a short scaffold.

###Gene_Info_Comments GLEAN3_23091 ###
Partial sequence of a prickle protein. The sequence presents homology with the GLEAN3_23090 but is not identical.
###Gene_Info_Comments GLEAN3_05447 ###
there's an internal duplication in the predicted protein, which is most likely an assembly problem
###Gene_Info_Comments GLEAN3_01586 ###
probably missing part of carboxy end
###Gene_Info_Comments Sp-Tlr183 ###
Partial Toll-like receptor. The nucleotides encoding CT, TM and partial TIR have 95% identity to another Sp-Tlr gene (19834). This gene model is located at the end of a short scaffold.

###Gene_Info_Comments GLEAN3_14864 ###
this hsp has a speract domain on the carboxy terminus. assigned to hsp70(3) family along with other hsps containing hsp70 domain and some gamete domain. needs additional verification.
###Gene_Info_Comments GLEAN3_18199 ###
Except for the first 50 aa, this sequence is also contained in GLEAN3_05021, though with additional interspersed sequences.

###Gene_Info_Comments GLEAN3_04983 ###
5' and 3' ends wrongly predicted.
The 5' end is on Scaffold80207, where there is no GLEAN model; my own GENSCAN analysis identifies a 169-aa gene (exon) (prediction 8 on Scaffold80207):
 8.00 Prom +  96623  96662   40                              -5.75
 8.01 Init +  96914  97145  232  1  1   71  111   260 0.725  25.07
 8.02 Intr +  98822  98985  164  0  2   28   25   110 0.308  -2.53
 8.03 Term + 101288 101401  114  1  0   79   42   218 0.969  13.89
 8.04 PlyA + 101438 101443    6                               1.05
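
The coordinates in the GENSCAN rows above can be cross-checked with a few lines of Python. This is a minimal sketch, not part of the original annotation: it assumes the fourth to sixth whitespace-separated fields are begin, end and length, that all exons lie on the plus strand, and that the three coding rows (Init, Intr, Term) make up the whole CDS. The names GENSCAN_ROWS and coding_exons are hypothetical.

```python
# Coding rows copied from the GENSCAN table above (first six fields only).
GENSCAN_ROWS = """\
8.01 Init +  96914  97145  232
8.02 Intr +  98822  98985  164
8.03 Term + 101288 101401  114
"""

CODING_TYPES = {"Init", "Intr", "Term"}


def coding_exons(text):
    """Return (type, begin, end, reported_length) for each coding exon row."""
    exons = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[1] not in CODING_TYPES:
            continue
        exons.append((fields[1], int(fields[3]), int(fields[4]), int(fields[5])))
    return exons


exons = coding_exons(GENSCAN_ROWS)

# Assumption: plus-strand, 1-based inclusive coordinates, so span = end - begin + 1.
spans_match = all(end - begin + 1 == length for _, begin, end, length in exons)

total_bp = sum(length for _, _, _, length in exons)
codons = total_bp // 3
residues = codons - 1  # the last codon is the stop codon

print(spans_match, total_bp, codons, residues)
```

Under these assumptions the three exons sum to 510 bp, i.e. 170 codons, which agrees with a 169-aa peptide followed by a stop codon.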

>Scaffold80207|GENSCAN_predicted_peptide_8|169_aa
MYRAVIYTIFVGLVCLDSVVEYGVEARRNGRKRNRNPGAGDVLSASGGDVVKVRPTPRRP
QIPLKAEVQPPHSRGVPGVQNWAQCQRLVVQLQVDAEAMRNSSNLSRQKYHFVEINLIRK
TYGTEQGDNHLVIIYFIVLSRLIIESIRFDDRMRSNNAERCDEQCRAGR

>Scaffold80207|GENSCAN_predicted_CDS_8|510_bp
atgtaccgtgcagtaatttacaccatcttcgtgggcctggtgtgcctggacagcgtggtt
gagtacggagtcgaagctcgcaggaatggaagaaagaggaacaggaatcctggagcaggg
gatgttttatctgcatccggtggtgatgttgtcaaggtgagaccgacaccaagaaggcct
cagattccactcaaagccgaggtacagcccccacattcaagaggtgttccaggggtgcaa
aattgggctcaatgtcaacgactggtagtgcaattacaagttgacgccgaggctatgcgt
aattcgagcaatttgtcgcgtcaaaaatatcactttgtcgaaataaatttgataaggaaa
acttatggtaccgagcaaggggataaccacttagtgataatctactttattgtcctttca
cgactaatcatcgagtctatacgatttgatgaccgaatgcggtcgaacaatgccgagcgg
tgtgacgaacaatgccgagcgggccgctaa
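
As a consistency check, the predicted CDS can be translated and compared with the predicted peptide. The sketch below is illustrative only and assumes the standard genetic code (NCBI translation table 1) and a reading frame that starts at the first base. The names CODON_TABLE, translate, CDS and PEPTIDE are hypothetical, and the sequences are copied from the two FASTA records above.

```python
from itertools import product

# Standard genetic code (assumed), listed in the conventional TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate DNA in frame 0; whitespace is ignored, '*' marks a stop."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


CDS = """
atgtaccgtgcagtaatttacaccatcttcgtgggcctggtgtgcctggacagcgtggtt
gagtacggagtcgaagctcgcaggaatggaagaaagaggaacaggaatcctggagcaggg
gatgttttatctgcatccggtggtgatgttgtcaaggtgagaccgacaccaagaaggcct
cagattccactcaaagccgaggtacagcccccacattcaagaggtgttccaggggtgcaa
aattgggctcaatgtcaacgactggtagtgcaattacaagttgacgccgaggctatgcgt
aattcgagcaatttgtcgcgtcaaaaatatcactttgtcgaaataaatttgataaggaaa
acttatggtaccgagcaaggggataaccacttagtgataatctactttattgtcctttca
cgactaatcatcgagtctatacgatttgatgaccgaatgcggtcgaacaatgccgagcgg
tgtgacgaacaatgccgagcgggccgctaa
"""

PEPTIDE = """
MYRAVIYTIFVGLVCLDSVVEYGVEARRNGRKRNRNPGAGDVLSASGGDVVKVRPTPRRP
QIPLKAEVQPPHSRGVPGVQNWAQCQRLVVQLQVDAEAMRNSSNLSRQKYHFVEINLIRK
TYGTEQGDNHLVIIYFIVLSRLIIESIRFDDRMRSNNAERCDEQCRAGR
"""

cds_length = len("".join(CDS.split()))
expected = "".join(PEPTIDE.split())
protein = translate(CDS)

# The translation should equal the peptide plus a terminal stop ('*').
print(cds_length, len(expected), protein == expected + "*")
```

A mismatch here would point to a copying error or a frame problem rather than to anything about the gene model itself.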



###Gene_Info_Comments GLEAN3_05021 ###
This sequence contains most of the GLEAN3_18199 
###Gene_Info_Comments Sp-TlrP41 ###
Partial Toll-like receptor. The nucleotides encoding LRR(9-19), CT, TM and partial TIR have 87% identity to another Sp-Tlr gene(GLEAN3_07850). This gene model occupies entire sequence of a short scaffold.

###Gene_Info_Comments Sp-TlrP42 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding SP, LRR(3-9) have 97% identity to another Sp-Tlr gene(11537). This gene model is located at the end of a short scaffold.

###Gene_Info_Comments Sp-TlrP43 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding SP, NT, LRR(11-22), CT, TM have 87% identity to another Sp-Tlr gene(07850). This gene model is located at the end of a scaffold.

###Gene_Info_Comments GLEAN3_22562 ###
Missing one (or more) exons at the beginning.
###Gene_Info_Comments Sp-TlrP44 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding SP, NT, LRR(0-2) have 87% identity to another Sp-Tlr gene(15066). This gene model is located at the end of a scaffold.

###Gene_Info_Comments Sp-TlrP45 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding LRR(5-9) have 84% identity to another Sp-Tlr gene(25312). This gene model is located at the end of a short scaffold.

###Gene_Info_Comments Sp-TlrP46 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding LRR (0-10) have 87% identity to another Sp-Tlr gene(15066). This gene model occupies entire sequence of a short scaffold.

###Gene_Info_Comments Sp-TlrP47 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding LRR (1-6) have 89% identity to another Sp-Tlr gene(18519). This gene model occupies entire sequence of a short scaffold.

###Gene_Info_Comments GLEAN3_20306 ###
May have one exon too many at the 3'-end.
###Gene_Info_Comments GLEAN3_13377 ###
Using HsSERPINE2 in a BLASTP search of the GLEAN3 models yielded 15 genes. Each was analyzed by SMART and the top 12 hits were found to have a SERPIN domain and were annotated Sp-serpin-like 3 through 12 (2 were already annotated by A.Cameron). BLASTP searches of the GLEAN3 models with other vertebrate Serpins yielded the same top 12 hits.
###Gene_Info_Comments Sp-TlrP48 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding LRR (6-10) have 87% identity to another Sp-Tlr gene(11541). This gene model occupies entire sequence of a short scaffold.

###Gene_Info_Comments GLEAN3_13378 ###
Using HsSERPINE2 in a BLASTP search of the GLEAN3 models yielded 15 genes. Each was analyzed by SMART and the top 12 hits were found to have a SERPIN domain and were annotated Sp-serpin-like 3 through 12 (2 were already annotated by A.Cameron). BLASTP searches of the GLEAN3 models with other vertebrate Serpins yielded the same top 12 hits.
###Gene_Info_Comments GLEAN3_18632 ###
Using HsSERPINE2 in a BLASTP search of the GLEAN3 models yielded 15 genes. Each was analyzed by SMART and the top 12 hits were found to have a SERPIN domain and were annotated Sp-serpin-like 3 through 12 (2 were already annotated by A.Cameron). BLASTP searches of the GLEAN3 models with other vertebrate Serpins yielded the same top 12 hits.
###Gene_Info_Comments Sp-Tlr114 ###
This gene model could be caused by an assembly error. The nucleotides encoding LRR (9-17) have more than 99.5% identity to another Sp-Tlr gene(21936). This gene model occupies entire sequence of a short scaffold.

###Gene_Info_Comments Sp-TlrP49 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding LRR (5-9), CT, TM have 91% identity to another Sp-Tlr gene(15066). This gene model occupies entire sequence of a short scaffold.

###Gene_Info_Comments GLEAN3_24263 ###
Using HsSERPINE2 in a BLASTP search of the GLEAN3 models yielded 15 genes. Each was analyzed by SMART and the top 12 hits were found to have a SERPIN domain and were annotated Sp-serpin-like 3 through 12 (2 were already annotated by A.Cameron). BLASTP searches of the GLEAN3 models with other vertebrate Serpins yielded the same top 12 hits.
###Gene_Info_Comments Sp-TlrP50 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding SP, NT, LRR(5-15) have 98% identity to another Sp-Tlr gene(05950). This gene model is located at the end of a contig.
###Gene_Info_Comments Sp-TlrP51 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding LRR (12-20) have 86% identity to another Sp-Tlr gene(11537). This gene model is located at the end of a scaffold.
###Gene_Info_Comments GLEAN3_14670 ###
The GLEAN3_14670 prediction apparently missed exons 5-18, present in other gene models, coding for the highly conserved catalytic domain of synaptojanin.
###Gene_Info_Comments Sp-TlrP52 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding LRR (5-12) have 93% identity to another Sp-Tlr gene(00199). This gene model is located at the end of a scaffold.

###Gene_Info_Comments Sp-TlrP53 ###
Partial Toll-like receptor. The nucleotides encoding SP, NT, LRR(16-24), CT, TM and partial TIR have 90% identity to another Sp-Tlr gene(06164). This gene model is located at the end of a scaffold.

###Gene_Info_Comments Sp-TlrP54 ###
Partial Toll-like receptor. The nucleotides encoding CT, TM and partial TIR have 93% identity to another Sp-Tlr gene (13536). This gene model occupies entire sequence of a short scaffold.

###Gene_Info_Comments Sp-TlrP55 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding SP, NT, LRR (3) have 87% identity to another Sp-Tlr gene(20741). This gene model is located at the end of a scaffold and may represent a pseudogene or contain stop codons.

###Gene_Info_Comments Sp-TlrP56 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding SP, NT, LRR (3) have 87% identity to another Sp-Tlr gene(24960). This gene model is located at the end of a scaffold.

###Gene_Info_Comments Sp-TlrP57 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding  LRR (8-16), CT, TM have 88% identity to another Sp-Tlr gene(16536). This gene model is located at the end of a scaffold.

###Gene_Info_Comments Sp-TlrP58 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding  SP, NT, LRR (13-22), CT, TM have 92% identity to another Sp-Tlr gene(20741). This gene model is located at the end of a scaffold.

###Gene_Info_Comments GLEAN3_08658 ###
Pfam00194 match.  Partial gene. 

Transcriptome data indicate that it is expressed in the embryo.

A family of carbonic anhydrase-like proteins exists in the sea urchin. Orthologous relationships with vertebrate carbonic anhydrases and related proteins have not been carefully examined.
###Gene_Info_Comments GLEAN3_13459 ###
Pfam00194 match.

A family of carbonic anhydrase-like proteins exists in the sea urchin. Orthologous relationships with vertebrate carbonic anhydrases and related proteins have not been carefully examined.
###Gene_Info_Comments GLEAN3_25722 ###
Pfam00194 match.  

A family of carbonic anhydrase-like proteins exists in the sea urchin. Orthologous relationships with vertebrate carbonic anhydrases and related proteins have not been carefully examined.
###Gene_Info_Comments GLEAN3_13458 ###
Partial gene. Pfam00194 match. 

A family of carbonic anhydrase-like proteins exists in the sea urchin. Orthologous relationships with vertebrate carbonic anhydrases and related proteins have not been carefully examined.
###Gene_Info_Comments GLEAN3_10471 ###
Pfam00194 match.

Strong similarity to a PMC EST (accession no. DN577792).

A family of carbonic anhydrase-like proteins exists in the sea urchin. Orthologous relationships with vertebrate carbonic anhydrases and related proteins have not been carefully examined.
###Gene_Info_Comments GLEAN3_00337 ###
SigPep-SRCR(4).  Possibly incomplete.  
###Gene_Info_Comments GLEAN3_00492 ###
SRCR(4) + 1 partial.  Gene probably partial.
###Gene_Info_Comments GLEAN3_00580 ###
2 SRCR domains. Probably partial.
###Gene_Info_Comments GLEAN3_28422 ###
There wasn't a good checkbox for the problem here.  This looks to be an assembly error, where 2 contigs were inappropriately joined, cramming together two unrelated proteins into one model.  The first several exons are a copy of the more properly assembled glean3_XXXXX.  Then there is a short repeated region (unmerged alleles), and finally the gene in question.
###Gene_Info_Comments GLEAN3_07587 ###
not full length
###Gene_Info_Comments GLEAN3_24316 ###
Not complete
###Gene_Info_Comments Sp-TlrP59 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding  SP, NT, LRR (13-21), CT, TM have 88% identity to another Sp-Tlr gene(20741). This gene model is located at the end of a scaffold.

###Gene_Info_Comments Sp-TlrP60 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding  LRR (2), CT, TM have 96% identity to another Sp-Tlr gene(08278). This gene model occupies entire sequence of a short scaffold.

###Gene_Info_Comments Sp-TlrP61 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding SP, LRR (5) have 98% identity to another Sp-Tlr gene(24208). This gene model occupies entire sequence of a short scaffold and may represent a pseudogene or contain a sequence error.

###Gene_Info_Comments Sp-TlrP62 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding SP, NT, LRR (0-7) have 89% identity to another Sp-Tlr gene(14352). This gene model is located at the end of a contig.

###Gene_Info_Comments Sp-TlrP63 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding LRR (5-6) have 89% identity to another Sp-Tlr gene(03419). This gene model occupies entire sequence of a short scaffold.

###Gene_Info_Comments Sp-TlrP64 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding NT, LRR (4-7) have 89% identity to another Sp-Tlr gene(09435). This gene model is at the end of a scaffold.

###Gene_Info_Comments Sp-TlrP65 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding SP, NT, LRR (5) have 87% identity to another Sp-Tlr gene(14352). This gene model is at the end of a short scaffold.

###Gene_Info_Comments Sp-TlrP66 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding LRR (5), CT, TM have 89% identity to another Sp-Tlr gene(03419). This gene model occupies entire sequence of a short scaffold.

###Gene_Info_Comments Sp-TlrP67 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding LRR (5-6), CT, TM have 94% identity to another Sp-Tlr gene(09435). This gene model occupies entire sequence of a short scaffold.

###Gene_Info_Comments Sp-TlrP68 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding SP, NT, LRR(8-14) have 90% identity to another Sp-Tlr gene(14352). This gene model occupies entire sequence of a short scaffold.

###Gene_Info_Comments GLEAN3_25548 ###
C terminus is missing
###Gene_Info_Comments Sp-TlrP69 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding NT, LRR(4-9) have 89% identity to another Sp-Tlr gene(14352). This gene model occupies entire sequence of a short scaffold.

###Gene_Info_Comments Sp-TlrP70 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding SP, NT, LRR(12-21) have 90% identity to another Sp-Tlr gene(14352). This gene model is located at the end of a contig and may represent a pseudogene or contain sequence error.

###Gene_Info_Comments Sp-TlrP71 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding LRR(6-7), CT, TM, TIR(partial) have 91% identity to another Sp-Tlr gene(09435). This gene model occupies entire sequence of a short scaffold and may represent a pseudogene or contain sequence error.

###Gene_Info_Comments Sp-TlrP72 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding SP, NT, LRR(14-23) have 87% identity to another Sp-Tlr gene(09435). This gene model is located at the end of a short scaffold.

###Gene_Info_Comments Sp-TlrP73 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding LRR(5), CT, TM have 91% identity to another Sp-Tlr gene(03419). This gene model is located at the end of a short scaffold.
###Gene_Info_Comments Sp-TlrP74 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding LRR(4-6), CT, TM have 92% identity to another Sp-Tlr gene(21225). This gene model occupies entire sequence of a short scaffold.

###Gene_Info_Comments GLEAN3_02306 ###
EGF/CCP/EGF/ZP
###Gene_Info_Comments GLEAN3_03171 ###
LY4/CUB/SR/ZP
###Gene_Info_Comments GLEAN3_04061 ###
NIDO/EGF/ZP
###Gene_Info_Comments GLEAN3_04611 ###
LY2/MAM/ZP
###Gene_Info_Comments GLEAN3_05270 ###
ZP/CCP3
###Gene_Info_Comments GLEAN3_13342 ###
SR/ZP
###Gene_Info_Comments GLEAN3_14213 ###
EGFCa3/ZP
###Gene_Info_Comments GLEAN3_16300 ###
CCP14/ZP
###Gene_Info_Comments GLEAN3_16840 ###
EGF2/ZP
###Gene_Info_Comments GLEAN3_18648 ###
CUB4/ZP
###Gene_Info_Comments GLEAN3_22873 ###
EGFCa/ZP
###Gene_Info_Comments GLEAN3_22889 ###
EGFCa/ZP
###Gene_Info_Comments GLEAN3_24217 ###
CCP/CLECT/CCP2/ZP
###Gene_Info_Comments GLEAN3_26587 ###
EGF/ZP
###Gene_Info_Comments GLEAN3_27535 ###
LY/ZP
###Gene_Info_Comments GLEAN3_28276 ###
CUB6/ZP
###Gene_Info_Comments GLEAN3_28843 ###
CCP11/ZP
###Gene_Info_Comments GLEAN3_28844 ###
CCP3/ZP
###Gene_Info_Comments Sp-TlrP75 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding LRR(5-7), CT, TM have 91% identity to another Sp-Tlr gene(21225). This gene model occupies entire sequence of a short scaffold.

###Gene_Info_Comments GLEAN3_20943 ###
This sequence is identical to the one deposited in GenBank as NP_999657 (derived from RefSeq NM_214492), except for an 18-amino acid (54 nucleotide) gap between the predicted initiator methionine (i.e. residue 1) and the second predicted amino acid residue in the GLEAN sequence.  This sequence (CCTCGAGAAATTATTACCTTACAGCTAGGACAATGTGGGAACCAGATTGGGATG in the RefSeq entry) is not detected in the Baylor DNA sequence dataset by BLASTP or TBLASTN.
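
The stated size of the gap (54 nucleotides, 18 amino acids) can be checked directly from the quoted RefSeq stretch. The sketch below is an illustration, not part of the original annotation: it assumes the standard genetic code and that the quoted stretch begins in frame, i.e. with the codon that immediately follows the initiator ATG. GAP_NT, CODON_TABLE and translate are hypothetical names.

```python
from itertools import product

# Standard genetic code (assumed), listed in the conventional TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# The stretch quoted from the RefSeq entry in the comment above.
GAP_NT = "CCTCGAGAAATTATTACCTTACAGCTAGGACAATGTGGGAACCAGATTGGGATG"


def translate(dna):
    """Translate DNA in frame 0 (assumed frame); '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


gap_peptide = translate(GAP_NT)
print(len(GAP_NT), len(gap_peptide), gap_peptide)
```

In this frame the 54 nucleotides give 18 residues with no internal stop codon, which matches the gap size stated above.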

Annotation entered by Bob Obar (robar@scientist.com).
###Gene_Info_Comments Sp-TlrP76 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding SP, NT, LRR(15-23) have 90% identity to another Sp-Tlr gene(14548). This gene model is located at the end of a short scaffold.

###Gene_Info_Comments Sp-TlrP77 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding SP, NT, LRR(15-23) have 91% identity to another Sp-Tlr gene(14548). This gene model is located at the end of a short scaffold.

###Gene_Info_Comments Sp-TlrP78 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding SP, NT, LRR(4-6) have 90% identity to another Sp-Tlr gene(27162). This gene model is located at the end of a short scaffold.

###Gene_Info_Comments Sp-TlrP79 ###
Partial Toll-like receptor does not have a TIR domain. The nucleotides encoding SP, LRR(2-4) have 93% identity to another Sp-Tlr gene(26274). This gene model is located at the end of a short scaffold.

###Gene_Info_Comments GLEAN3_00323 ###
 fragment
###Gene_Info_Comments GLEAN3_00359 ###
 fragment
###Gene_Info_Comments GLEAN3_00444 ###
 fragment
###Gene_Info_Comments GLEAN3_00472 ###
 fragment
###Gene_Info_Comments GLEAN3_00591 ###
 fragment
###Gene_Info_Comments GLEAN3_00592 ###
 fragment
###Gene_Info_Comments GLEAN3_00769 ###
 fragment
###Gene_Info_Comments GLEAN3_00777 ###
 fragment
###Gene_Info_Comments GLEAN3_00791 ###
 fragment
###Gene_Info_Comments GLEAN3_00820 ###
 fragment
###Gene_Info_Comments GLEAN3_00859 ###
 fragment
###Gene_Info_Comments GLEAN3_00901 ###
 fragment
###Gene_Info_Comments GLEAN3_01023 ###
 fragment
###Gene_Info_Comments GLEAN3_01036 ###
 fragment
###Gene_Info_Comments GLEAN3_01064 ###
 fragment
###Gene_Info_Comments GLEAN3_01286 ###
 fragment
###Gene_Info_Comments GLEAN3_01350 ###
 fragment
###Gene_Info_Comments GLEAN3_01364 ###
 fragment
###Gene_Info_Comments GLEAN3_01417 ###
 fragment
###Gene_Info_Comments GLEAN3_01453 ###
 fragment
###Gene_Info_Comments GLEAN3_01766 ###
 fragment, should join with GLEAN3_01767, still incomplete gene
###Gene_Info_Comments GLEAN3_01767 ###
 fragment, should join with GLEAN3_01766, still incomplete gene
###Gene_Info_Comments GLEAN3_01770 ###
 fragment
###Gene_Info_Comments GLEAN3_01828 ###
 partial
###Gene_Info_Comments GLEAN3_01895 ###
 fragment
###Gene_Info_Comments GLEAN3_01910 ###
 fragment
###Gene_Info_Comments GLEAN3_01973 ###
 fragment
###Gene_Info_Comments GLEAN3_02544 ###
 fragment
###Gene_Info_Comments GLEAN3_02741 ###
 partial
###Gene_Info_Comments GLEAN3_02804 ###
 fragment
###Gene_Info_Comments GLEAN3_02907 ###
 fragment
###Gene_Info_Comments GLEAN3_02910 ###
 fragment
###Gene_Info_Comments GLEAN3_02996 ###
 fragment
###Gene_Info_Comments GLEAN3_03311 ###
 fragment
###Gene_Info_Comments GLEAN3_03331 ###
 fragment
###Gene_Info_Comments GLEAN3_03374 ###
 fragment, should join GLEAN3_03375, still incomplete gene
###Gene_Info_Comments GLEAN3_03375 ###
 fragment, should join GLEAN3_03374, still incomplete gene
###Gene_Info_Comments GLEAN3_03431 ###
 fragment
###Gene_Info_Comments GLEAN3_03443 ###
 fragment
###Gene_Info_Comments GLEAN3_03465 ###
 fragment
###Gene_Info_Comments GLEAN3_03510 ###
 fragment
###Gene_Info_Comments GLEAN3_03623 ###
 fragment
###Gene_Info_Comments GLEAN3_03625 ###
 partial; missing C-terminus
###Gene_Info_Comments GLEAN3_03631 ###
 fragment
###Gene_Info_Comments GLEAN3_03663 ###
 fragment
###Gene_Info_Comments GLEAN3_03729 ###
 fragment
###Gene_Info_Comments GLEAN3_03784 ###
 fragment
###Gene_Info_Comments GLEAN3_03886 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_03977 ###
 fragment
###Gene_Info_Comments GLEAN3_03983 ###
 fragment
###Gene_Info_Comments GLEAN3_04141 ###
 fragment
###Gene_Info_Comments GLEAN3_04142 ###
 fragment
identical to GLEAN3_00281
###Gene_Info_Comments GLEAN3_04226 ###
 fragment
###Gene_Info_Comments GLEAN3_04450 ###
 fragment
###Gene_Info_Comments GLEAN3_04499 ###
 fragment
###Gene_Info_Comments GLEAN3_04567 ###
 fragment
###Gene_Info_Comments GLEAN3_04621 ###
 fragment
###Gene_Info_Comments GLEAN3_04633 ###
 fragment
###Gene_Info_Comments GLEAN3_04708 ###
 fragment
###Gene_Info_Comments GLEAN3_04725 ###
 extra C-terminus
###Gene_Info_Comments GLEAN3_04743 ###
 fragment
###Gene_Info_Comments GLEAN3_04815 ###
 insertions
###Gene_Info_Comments GLEAN3_04816 ###
 insertions
###Gene_Info_Comments GLEAN3_04911 ###
 fragment
###Gene_Info_Comments GLEAN3_04930 ###
 fragment
###Gene_Info_Comments GLEAN3_04935 ###
 fragment
###Gene_Info_Comments GLEAN3_05047 ###
 fragment
###Gene_Info_Comments GLEAN3_05183 ###
 partial, missing C-terminus region
###Gene_Info_Comments GLEAN3_05249 ###
 fragment
###Gene_Info_Comments GLEAN3_05344 ###
 fragment
###Gene_Info_Comments GLEAN3_05357 ###
 fragment
###Gene_Info_Comments GLEAN3_05370 ###
 fragment, has extra residues on C-terminus
###Gene_Info_Comments GLEAN3_05625 ###
 fragment
###Gene_Info_Comments GLEAN3_05717 ###
 fragment
###Gene_Info_Comments GLEAN3_05754 ###
 fragment, missing stretch in middle
###Gene_Info_Comments GLEAN3_05803 ###
 small fragment
###Gene_Info_Comments GLEAN3_05995 ###
 fragment
###Gene_Info_Comments GLEAN3_06257 ###
 fragment
###Gene_Info_Comments GLEAN3_06323 ###
 small fragment
###Gene_Info_Comments GLEAN3_06521 ###
 fragment
###Gene_Info_Comments GLEAN3_06556 ###
 fragment
###Gene_Info_Comments GLEAN3_06698 ###
 small fragment
###Gene_Info_Comments GLEAN3_06779 ###
 fragment
###Gene_Info_Comments GLEAN3_06870 ###
 fragment
###Gene_Info_Comments GLEAN3_06887 ###
 small fragment
###Gene_Info_Comments GLEAN3_07045 ###
 fragment
###Gene_Info_Comments GLEAN3_07321 ###
 fragment, extra N-terminus residues
###Gene_Info_Comments GLEAN3_07657 ###
 fragment
###Gene_Info_Comments GLEAN3_07739 ###
 fragment
###Gene_Info_Comments GLEAN3_07748 ###
 fragment
###Gene_Info_Comments GLEAN3_07853 ###
 fragment
###Gene_Info_Comments GLEAN3_08032 ###
 fragment
###Gene_Info_Comments GLEAN3_08069 ###
 fragment
###Gene_Info_Comments GLEAN3_08130 ###
 fragment
###Gene_Info_Comments GLEAN3_08288 ###
 fragment
###Gene_Info_Comments GLEAN3_08308 ###
 fragment
###Gene_Info_Comments GLEAN3_08320 ###
 fragment
###Gene_Info_Comments GLEAN3_08372 ###
 fragment
###Gene_Info_Comments GLEAN3_08459 ###
 fragment
###Gene_Info_Comments GLEAN3_08685 ###
 fragment
###Gene_Info_Comments GLEAN3_08734 ###
 fragment
###Gene_Info_Comments GLEAN3_08748 ###
 fragment
###Gene_Info_Comments GLEAN3_08759 ###
 fragment
###Gene_Info_Comments GLEAN3_09056 ###
 fragment, extra mismatch stretch on C-terminus
###Gene_Info_Comments GLEAN3_09158 ###
 fragment
###Gene_Info_Comments GLEAN3_09281 ###
 small fragment
###Gene_Info_Comments GLEAN3_09471 ###
 fragment
###Gene_Info_Comments GLEAN3_09625 ###
 small fragment
###Gene_Info_Comments GLEAN3_09697 ###
 fragment
###Gene_Info_Comments GLEAN3_09733 ###
 fragment
###Gene_Info_Comments GLEAN3_09844 ###
 fragment
###Gene_Info_Comments GLEAN3_10234 ###
 fragment
###Gene_Info_Comments GLEAN3_10289 ###
 fragment
###Gene_Info_Comments GLEAN3_11052 ###
 fragment
###Gene_Info_Comments GLEAN3_11215 ###
 fragment
###Gene_Info_Comments GLEAN3_11390 ###
 partial, missing C-terminus half
###Gene_Info_Comments GLEAN3_11435 ###
 fragment
###Gene_Info_Comments GLEAN3_11587 ###
 fragment
###Gene_Info_Comments GLEAN3_11791 ###
 fragment
###Gene_Info_Comments GLEAN3_12316 ###
 fragment
###Gene_Info_Comments GLEAN3_12600 ###
 partial, missing C-terminus half
###Gene_Info_Comments GLEAN3_12832 ###
 fragment
###Gene_Info_Comments GLEAN3_12941 ###
 fragment
###Gene_Info_Comments GLEAN3_12945 ###
 fragment
###Gene_Info_Comments GLEAN3_12947 ###
 fragment
###Gene_Info_Comments GLEAN3_13036 ###
 fragment
###Gene_Info_Comments GLEAN3_13051 ###
 fragment
###Gene_Info_Comments GLEAN3_13215 ###
 fragment
###Gene_Info_Comments GLEAN3_13473 ###
 fragment
###Gene_Info_Comments GLEAN3_13529 ###
 fragment
###Gene_Info_Comments GLEAN3_13546 ###
 small fragment
###Gene_Info_Comments GLEAN3_13637 ###
 fragment
###Gene_Info_Comments GLEAN3_13678 ###
 fragment
###Gene_Info_Comments GLEAN3_14043 ###
 fragment
###Gene_Info_Comments GLEAN3_14306 ###
 small fragment
###Gene_Info_Comments GLEAN3_14512 ###
 fragment
###Gene_Info_Comments GLEAN3_14648 ###
 fragment
###Gene_Info_Comments GLEAN3_14856 ###
 fragment
###Gene_Info_Comments GLEAN3_15031 ###
 fragment
###Gene_Info_Comments GLEAN3_15060 ###
 fragment
###Gene_Info_Comments GLEAN3_15061 ###
 fragment
###Gene_Info_Comments GLEAN3_15095 ###
 fragment
###Gene_Info_Comments GLEAN3_15100 ###
 fragment
###Gene_Info_Comments GLEAN3_15101 ###
 fragment
###Gene_Info_Comments GLEAN3_15194 ###
 missing N-terminus, extra C-terminus
###Gene_Info_Comments GLEAN3_15476 ###
 partial, missing most of the C-terminus
###Gene_Info_Comments GLEAN3_15492 ###
 fragment
###Gene_Info_Comments GLEAN3_15571 ###
 fragment
###Gene_Info_Comments GLEAN3_16020 ###
 fragment
###Gene_Info_Comments GLEAN3_16176 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_16633 ###
 fragment
###Gene_Info_Comments GLEAN3_16636 ###
 fragment
###Gene_Info_Comments GLEAN3_16901 ###
 fragment
###Gene_Info_Comments GLEAN3_16917 ###
 small fragment
###Gene_Info_Comments GLEAN3_17139 ###
 fragment, unmatched residues on C-terminus
###Gene_Info_Comments GLEAN3_17143 ###
 fragment
###Gene_Info_Comments GLEAN3_17170 ###
 fragment
###Gene_Info_Comments GLEAN3_17174 ###
 missing N- and C-terminus residues
###Gene_Info_Comments GLEAN3_17608 ###
 fragment
###Gene_Info_Comments GLEAN3_17696 ###
 partial, missing C-terminus half
###Gene_Info_Comments GLEAN3_17703 ###
 fragment
###Gene_Info_Comments GLEAN3_18027 ###
 fragment, unmatched stretch of amino acids on N-terminus
###Gene_Info_Comments GLEAN3_18083 ###
 partial, missing C-terminus half
###Gene_Info_Comments GLEAN3_18092 ###
 fragment
###Gene_Info_Comments GLEAN3_18766 ###
 missing some amino acid stretches in middle
###Gene_Info_Comments GLEAN3_18767 ###
 extra stretch of amino acids in middle
###Gene_Info_Comments GLEAN3_18929 ###
 fragment
###Gene_Info_Comments GLEAN3_19020 ###
 fragment, should join GLEAN3_19027, still missing the N-terminus region
###Gene_Info_Comments GLEAN3_19027 ###
 fragment; should join GLEAN3_19020, still missing the N-terminus region
###Gene_Info_Comments GLEAN3_19298 ###
 fragment
###Gene_Info_Comments GLEAN3_19402 ###
 fragment
###Gene_Info_Comments GLEAN3_19780 ###
 fragment
###Gene_Info_Comments GLEAN3_20014 ###
 fragment
###Gene_Info_Comments GLEAN3_20174 ###
 fragment
###Gene_Info_Comments GLEAN3_22715 ###
 fragment
###Gene_Info_Comments GLEAN3_22752 ###
 fragment
###Gene_Info_Comments GLEAN3_22980 ###
 fragment
###Gene_Info_Comments GLEAN3_22989 ###
 fragment
###Gene_Info_Comments GLEAN3_23129 ###
 small fragment
###Gene_Info_Comments GLEAN3_23227 ###
 small fragment
###Gene_Info_Comments GLEAN3_23266 ###
 small fragment
###Gene_Info_Comments GLEAN3_23323 ###
 fragment
###Gene_Info_Comments GLEAN3_23515 ###
 fragment
###Gene_Info_Comments GLEAN3_23680 ###
 fragment
###Gene_Info_Comments GLEAN3_23772 ###
 small fragment
###Gene_Info_Comments GLEAN3_23841 ###
 small fragment
###Gene_Info_Comments GLEAN3_23849 ###
 fragment
###Gene_Info_Comments GLEAN3_24227 ###
 fragment
###Gene_Info_Comments GLEAN3_24229 ###
 fragment
###Gene_Info_Comments GLEAN3_24313 ###
 fragment
###Gene_Info_Comments GLEAN3_24396 ###
 fragment
###Gene_Info_Comments GLEAN3_24430 ###
 fragment
###Gene_Info_Comments GLEAN3_24541 ###
 fragment
###Gene_Info_Comments GLEAN3_24557 ###
 fragment
###Gene_Info_Comments GLEAN3_25087 ###
 fragment
###Gene_Info_Comments GLEAN3_25224 ###
 fragment
###Gene_Info_Comments GLEAN3_25596 ###
 fragment
###Gene_Info_Comments GLEAN3_26372 ###
 fragment
###Gene_Info_Comments GLEAN3_26682 ###
 fragment
###Gene_Info_Comments GLEAN3_26700 ###
 fragment
###Gene_Info_Comments GLEAN3_26839 ###
 fragment
###Gene_Info_Comments GLEAN3_26988 ###
 fragment
###Gene_Info_Comments GLEAN3_26994 ###
 fragment
###Gene_Info_Comments GLEAN3_27071 ###
 fragment
###Gene_Info_Comments GLEAN3_27495 ###
 fragment
###Gene_Info_Comments GLEAN3_27591 ###
 partial, missing C-terminus half
###Gene_Info_Comments GLEAN3_27901 ###
 fragment
###Gene_Info_Comments GLEAN3_28013 ###
 fragment
###Gene_Info_Comments GLEAN3_28083 ###
 fragment
###Gene_Info_Comments GLEAN3_28129 ###
 fragment
###Gene_Info_Comments GLEAN3_28237 ###
 fragment
###Gene_Info_Comments GLEAN3_28307 ###
 fragment, missing stretch in middle
###Gene_Info_Comments GLEAN3_28454 ###
 fragment
###Gene_Info_Comments GLEAN3_28651 ###
 fragment
###Gene_Info_Comments GLEAN3_28709 ###
 fragment
###Gene_Info_Comments GLEAN3_04266 ###
Annotation entered by Bob Obar (robar@scientist.com).
###Gene_Info_Comments GLEAN3_05500 ###
C1q-related
###Gene_Info_Comments GLEAN3_06578 ###
C1q-related
###Gene_Info_Comments GLEAN3_09020 ###
C1q-related
###Gene_Info_Comments GLEAN3_00433 ###
Single CADH domain-nothing else-obviously a cadherin fragment
###Gene_Info_Comments GLEAN3_01454 ###
21 CADH repeats but no TM or Cyto domain-probably a partial-could it be hitched up to GLEAN3_01452-CLEARLY A CADHERIN BUT CLASS UNCLEAR
###Gene_Info_Comments GLEAN3_02742 ###
11 cadh REPEATS BUT NO TM OR CYTO DOMAIN-CLEARLY A CADHERIN BUT CLASS UNCLEAR
###Gene_Info_Comments GLEAN3_03730 ###
4 CADH + TM AND CYTO BUT NO CAT-BD-NON-CLASSICAL CADHERIN of the vertebrate type
###Gene_Info_Comments GLEAN3_04074 ###
Single CADH domain-nothing else-obviously a cadherin fragment.
###Gene_Info_Comments GLEAN3_04556 ###
2 CADH domains-nothing else-obviously a cadherin fragment.
###Gene_Info_Comments GLEAN3_05228 ###
Single CADH domain followed by a LAMG domain-looks like a fly cadherin
###Gene_Info_Comments GLEAN3_08299 ###
5 CADH domains and a TM domain-probable cadherin fragment of vertebrate type
###Gene_Info_Comments GLEAN3_08380 ###
Single CADH domain and a TM domain-probably a cadherin fragment
###Gene_Info_Comments GLEAN3_09073 ###
12 CADH domains and a set of EGF and LAMG repeats before a TM domain.  Cytoplasmic domain present-but no catenin-binding domain-probable non-classical cadherin of the fly type.
###Gene_Info_Comments GLEAN3_10840 ###
13 CADH domains and EGF/LAMG/EGF_ probable cadherin fragment of the fly type.
###Gene_Info_Comments GLEAN3_11375 ###
8 CADH domains-no TM-probable cadherin fragment.
###Gene_Info_Comments GLEAN3_13323 ###
8 CADH domains-no TM-probable cadherin fragment.
###Gene_Info_Comments GLEAN3_13476 ###
5 CADH repeats-no TM-also N-terminal ANK repeats-possible fragment/possible concatenation
###Gene_Info_Comments GLEAN3_14606 ###
6 CADH repeats and TM-cyto domain present but w/o catenin-binding domain-probable non-classical cadherin of the vertebrate type
###Gene_Info_Comments GLEAN3_15210 ###
6 CADH domains-no TM-probable cadherin fragment
###Gene_Info_Comments GLEAN3_16980 ###
20 CADH repeats, single EGF and TM-non-classical cadherin of the fly type
###Gene_Info_Comments GLEAN3_17039 ###
3 CADH domains, 2 EGF and LAMG-no TM or cytoplasmic domain-probable cadherin fragment of the fly type
###Gene_Info_Comments GLEAN3_19394 ###
11 CADH repeats and TM-cyto domain present but w/o catenin-binding domain-probable non-classical cadherin
###Gene_Info_Comments GLEAN3_19783 ###
partial classical cadherin of the fly type, no CADH repeats but has classical cadherin cyto domain
###Gene_Info_Comments GLEAN3_21086 ###
2 CADH domains and TM-short cytoplasmic domain-probable partial cadherin fragment of vertebrate type
###Gene_Info_Comments GLEAN3_23005 ###
Single CADH domain-nothing else-obviously a cadherin fragment.
###Gene_Info_Comments GLEAN3_25622 ###
7 CADHs plus TM and cytoplasmic domain but no catenin-binding site-probable non-classical cadherin of the vertebrate type-previously reported as homolog of protocadherin 9
###Gene_Info_Comments GLEAN3_26595 ###
Single CADH domain-nothing else-obviously a cadherin fragment
###Gene_Info_Comments GLEAN3_27133 ###
LOOKS A BIT LIKE FLAMINGO (SEVERAL TM DOMAINS-BUT NO GPS DOMAIN), FLAMINGO/CELSR subfamily
###Gene_Info_Comments GLEAN3_27356 ###
25 CADH PLUS EGF/LAMG/EGF AND TM-cyto domain present but w/o catenin-binding domain-probable non-classical cadherin of the fly type
###Gene_Info_Comments GLEAN3_28328 ###
3 CADH domains-nothing else-obviously a cadherin fragment.
###Gene_Info_Comments GLEAN3_00958 ###
collagen fragment
###Gene_Info_Comments GLEAN3_04531 ###
collagen fragment
###Gene_Info_Comments GLEAN3_05187 ###
collagen fragment
###Gene_Info_Comments GLEAN3_06067 ###
collagen fragment
###Gene_Info_Comments GLEAN3_07582 ###
collagen fragment
###Gene_Info_Comments GLEAN3_11736 ###
collagen fragment
###Gene_Info_Comments GLEAN3_12707 ###
collagen fragment
###Gene_Info_Comments GLEAN3_13354 ###
collagen fragment
###Gene_Info_Comments GLEAN3_14619 ###
adjacent fragment of a fibrillar collagen
###Gene_Info_Comments GLEAN3_17571 ###
large fragment with a few collagen repeats-could be an N-terminal pro piece
###Gene_Info_Comments GLEAN3_21235 ###
NOVEL COLLAGEN ARCHITECTURE-FUSION??
###Gene_Info_Comments GLEAN3_22116 ###
possible relative of human col24a1
###Gene_Info_Comments GLEAN3_22882 ###
collagen fragment
###Gene_Info_Comments GLEAN3_22896 ###
collagen fragment
###Gene_Info_Comments GLEAN3_22936 ###
possible relative of col3a1
###Gene_Info_Comments GLEAN3_23283 ###
possible relative of col9a3
###Gene_Info_Comments GLEAN3_25369 ###
collagen fragment
###Gene_Info_Comments GLEAN3_26786 ###
collagen fragment
###Gene_Info_Comments GLEAN3_27250 ###
collagen fragment
###Gene_Info_Comments GLEAN3_17872 ###
possible relative of col1a1
###Gene_Info_Comments GLEAN3_13557 ###
"C-terminal fragment of a fibrillar collagen, possible relative of col27a1"
###Gene_Info_Comments GLEAN3_14618 ###
C-terminal fragment of a fibrillar collagen-adjacent gene 14619 contains another fragment
###Gene_Info_Comments GLEAN3_17791 ###
N-terminal fragment of a fibrillar collagen
###Gene_Info_Comments GLEAN3_28613 ###
C terminus of fibrillar collagen
###Gene_Info_Comments GLEAN3_05167 ###
fibrillar collagen of the I/II/III subclass-partial-lacks C-terminus
###Gene_Info_Comments GLEAN3_26008 ###
fibrillar collagen of the I/II/III subclass -appears complete
###Gene_Info_Comments GLEAN3_26009 ###
fibrillar collagen of the I/II/III subclass -appears complete
###Gene_Info_Comments GLEAN3_09076 ###
Fibrillar collagen of the V/XI type-PROBABLY COMPLETE
###Gene_Info_Comments GLEAN3_11016 ###
olfactomedin-related collagen
###Gene_Info_Comments GLEAN3_03768 ###
type IV collagen-could be complete
###Gene_Info_Comments GLEAN3_15708 ###
C-terminal fragment of a type IV collagen
###Gene_Info_Comments GLEAN3_00142 ###
collagen XV/XVIII-partial-lacks N-terminal TSPN/LamG domain
###Gene_Info_Comments GLEAN3_00691 ###
NOVEL ARCHITECTURE - CLECT and TSP1 domains alternating plus FA58C and FTP-"LINK" at C-terminus

- a bit similar to GLEAN3_19437 and GLEAN3_05426
###Gene_Info_Comments GLEAN3_05426 ###
NOVEL ARCHITECTURE - CLECT and TSP1 domains alternating plus EGFCA repeats at N-terminus and FA58C and "LINK"/PANAP at C-terminus

- a bit similar to GLEAN3_19437 and GLEAN3_00691
###Gene_Info_Comments GLEAN3_01768 ###
"PUTATIVE LAM G CHAIN; has LamNT, LamB"
###Gene_Info_Comments GLEAN3_06118 ###
"partial LAM A OR G CHAIN, has LamB"
###Gene_Info_Comments GLEAN3_06558 ###
"Looks complete-has lamB domain-so most like lam g chain-has LamNT, LamB"
###Gene_Info_Comments GLEAN3_07555 ###
"partial LAM A OR G CHAIN, has LamB"
###Gene_Info_Comments GLEAN3_09846 ###
"partial LAM A OR G CHAIN, has LamB"
###Gene_Info_Comments GLEAN3_14257 ###
"putative laminin fragment, LamB only, best blast hits are proteoglycan"
###Gene_Info_Comments GLEAN3_15482 ###
Looks complete-has NO lamB domain-so most like lam b chain-has LamNT
###Gene_Info_Comments GLEAN3_20192 ###
"PUTATIVE LAM A CHAIN; has LamNT, LamB, missing C-terminus"
###Gene_Info_Comments GLEAN3_22929 ###
"LAM A1/2 CHAIN; has LamNT, LamB, LamG, missing C-terminus"
###Gene_Info_Comments GLEAN3_26039 ###
"LAM A3/5 CHAIN; has LamNT, LamB, LamG, looks complete  "
###Gene_Info_Comments GLEAN3_27389 ###
"putative laminin fragment-lam a or g, has LamB"
###Gene_Info_Comments GLEAN3_13481 ###
has LamNT-looks incomplete-could be a laminin or a netrin
###Gene_Info_Comments GLEAN3_25057 ###
has LamNT-looks incomplete-could be a laminin or a netrin
###Gene_Info_Comments GLEAN3_26322 ###
has LamNT-looks incomplete-could be a laminin or a netrin
###Gene_Info_Comments GLEAN3_01769 ###
has LamNT-Looks incomplete-could be a laminin or a netrin
###Gene_Info_Comments GLEAN3_28368 ###
has LamNT-Looks incomplete-could be a laminin or a netrin
###Gene_Info_Comments GLEAN3_06557 ###
has LamNT-looks incomplete-could be a laminin or a netrin
###Gene_Info_Comments GLEAN3_00176 ###
novel architecture-no known proteins with this composition/organization
###Gene_Info_Comments GLEAN3_01328 ###
"has a reeler domain and an EGF, may belong with reelin genes"
###Gene_Info_Comments GLEAN3_02367 ###
"just a reeler domain, may belong with reelin genes"
###Gene_Info_Comments GLEAN3_07070 ###
novel membrane protein with reeler domain and multiple EGF repeats-no known proteins with this composition/organization
###Gene_Info_Comments GLEAN3_11038 ###
"just a reeler domain, may belong with reelin genes"
###Gene_Info_Comments GLEAN3_12092 ###
"just a reeler domain, may belong with reelin genes"
###Gene_Info_Comments GLEAN3_13071 ###
"just a reeler domain, may belong with reelin genes"
###Gene_Info_Comments GLEAN3_14572 ###
Enormous protein with multiple EGF and EGFCA domains-N-terminal Reeler and a CUB domain near C-terminus. No known proteins with this composition/organization
###Gene_Info_Comments GLEAN3_15603 ###
"has a reeler domain and two EGFs, may belong with reelin genes"
###Gene_Info_Comments GLEAN3_15604 ###
"has a reeler domain and an EGF, may belong with reelin genes"
###Gene_Info_Comments GLEAN3_16091 ###
reeler domain plus DoH catecholamine-binding domain
###Gene_Info_Comments GLEAN3_16188 ###
"has a reeler domain and an EGF, may belong with reelin genes"
###Gene_Info_Comments GLEAN3_16612 ###
"just a reeler domain, may belong with reelin genes"
###Gene_Info_Comments GLEAN3_24087 ###
has a reeler domain but also a RING domain-probably not really related to reeler-novel domain combination
###Gene_Info_Comments GLEAN3_26165 ###
has a reeler domain but also a large block of repetitive simple sequences and two CCP domains-probably not really related to reeler
###Gene_Info_Comments GLEAN3_26222 ###
"just a reeler domain, may belong with reelin genes-note that adjacent gene very similar"
###Gene_Info_Comments GLEAN3_26223 ###
"just a reeler domain, may belong with reelin genes-note that adjacent gene very similar"
###Gene_Info_Comments GLEAN3_26550 ###
"enormous protein with reeler domain, CUB domain and multiple internal EGF repeats-looks like assembly problem"
###Gene_Info_Comments GLEAN3_13829 ###
large protein with few defined domains-2CCP at one end-SEA/EGF at the other
###Gene_Info_Comments GLEAN3_16171 ###
"SEA, LDLa and CUB domains-novel architecture"
###Gene_Info_Comments GLEAN3_17657 ###
SEA and EGFs
###Gene_Info_Comments GLEAN3_21919 ###
CCP and SEA domains- novel architecture
###Gene_Info_Comments GLEAN3_22753 ###
CCP and SEA domains- novel architecture
###Gene_Info_Comments GLEAN3_27729 ###
SEA and EGF
###Gene_Info_Comments GLEAN3_19795 ###
two SEA domains-there is a human protein-interphotoreceptor proteoglycan-with a similar structure
###Gene_Info_Comments GLEAN3_01785 ###
hyalin protein with SEA domain
###Gene_Info_Comments GLEAN3_02355 ###
very large protein with many hyalin repeats and SEA domain and EGF repeats near the C-terminus

expressed during embryonic development
###Gene_Info_Comments GLEAN3_03635 ###
"hyalin repeat protein with some other domains interspersed-SEA, VWD, CUB, LDLa"
###Gene_Info_Comments GLEAN3_04945 ###
SEA/HYR/CLECT - short hyalin protein
###Gene_Info_Comments GLEAN3_09594 ###
has Spond_N and TSP1 domains-related to vertebrate spondins
###Gene_Info_Comments GLEAN3_20379 ###
has Spond_N and TSP1 domains-related to vertebrate spondins
###Gene_Info_Comments GLEAN3_13393 ###
Annotation entered by Bob Obar (robar@scientist.com).
The epsilon-tubulin protein family is not yet a coherent one, and it is impossible at this time to determine whether the lack of similarity observed between the amino-terminal ~100 amino acids of this sequence and the corresponding segments of other epsilon-tubulin database entries is due to divergence or error.
###Gene_Info_Comments GLEAN3_03119 ###
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137536744-28277-70510656271.BLASTQ4
###Gene_Info_Comments GLEAN3_02663 ###
Annotation entered by Bob Obar (robar@scientist.com).
This GLEAN originally represented an amino-terminal segment of an epsilon-tubulin encoded by Scaffold498.  The gene model was later manually extended using sequence from Scaffoldi3903, which appears to encode the entire epsilon-tubulin polypeptide.
###Gene_Info_Comments GLEAN3_12141 ###
Annotation entered by Bob Obar (robar@scientist.com).
This GLEAN represents an amino-terminal segment of an epsilon-tubulin.  Except for 6 codons near the amino terminus of each, it is identical to GLEAN3_02663.
###Gene_Info_Comments GLEAN3_00178 ###
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137535187-7613-25495531689.BLASTQ4
###Gene_Info_Comments GLEAN3_22346 ###
Pfam00194 match.

A family of carbonic anhydrase-like proteins exists in the sea urchin. Orthologous relationships with vertebrate carbonic anhydrases and related proteins have not been carefully examined.
###Gene_Info_Comments GLEAN3_26483 ###
Pfam00194 match.

A family of carbonic anhydrase-like proteins exists in the sea urchin. Orthologous relationships with vertebrate carbonic anhydrases and related proteins have not been carefully examined.
###Gene_Info_Comments GLEAN3_16740 ###
Pfam00194 match.  Transcriptome data indicate that it is expressed in the embryo.

A family of carbonic anhydrase-like proteins exists in the sea urchin. Orthologous relationships with vertebrate carbonic anhydrases and related proteins have not been carefully examined.


###Gene_Info_Comments GLEAN3_09509 ###
Pfam00194 match.

A family of carbonic anhydrase-like proteins exists in the sea urchin. Orthologous relationships with vertebrate carbonic anhydrases and related proteins have not been carefully examined.
###Gene_Info_Comments GLEAN3_26747 ###
Pfam00194 match.

A family of carbonic anhydrase-like proteins exists in the sea urchin. Orthologous relationships with vertebrate carbonic anhydrases and related proteins have not been carefully examined.
###Gene_Info_Comments GLEAN3_12995 ###
Pfam00194 match. Transcriptome data indicate it is expressed in the embryo.

A family of carbonic anhydrase-like proteins exists in the sea urchin. Orthologous relationships with vertebrate carbonic anhydrases and related proteins have not been carefully examined.
###Gene_Info_Comments GLEAN3_00702 ###
Pfam00194 match.

A family of carbonic anhydrase-like proteins exists in the sea urchin. Orthologous relationships with vertebrate carbonic anhydrases and related proteins have not been carefully examined.
###Gene_Info_Comments GLEAN3_08894 ###
Pfam00194 match.  Transcriptome data indicate that gene is expressed in the embryo.

A family of carbonic anhydrase-like proteins exists in the sea urchin. Orthologous relationships with vertebrate carbonic anhydrases and related proteins have not been carefully examined.
###Gene_Info_Comments GLEAN3_06053 ###
gene model spans at least 3 glean predictions 
GLEAN3_06053 (N-ter)
GLEAN3_06054 (middle) located on the same scaffold but on opposite strand (assembly problem?)
GLEAN3_24289 (end)

2 GLEAN predictions are duplicates of some exons: GLEAN3_06612 (middle) and GLEAN3_01692 (end)
###Gene_Info_Comments GLEAN3_06054 ###
cf. GLEAN3_06053
###Gene_Info_Comments GLEAN3_24289 ###
cf. GLEAN3_06053
###Gene_Info_Comments GLEAN3_16272 ###
There seem to be some extra predicted exons in the GLEAN prediction. The NCBI GNOMON prediction (XP_797469) seems to be more accurate.

Best homology is with vertebrate Alk6, but as the closely related vertebrate Alk3 doesn't seem to have a counterpart in the sea urchin, I called it Alk3-6.
###Gene_Info_Comments GLEAN3_14830 ###
This model contains exons identical to GLEAN3_28742 and is either a duplication or an allele.
###Gene_Info_Comments GLEAN3_04951 ###
Partial Toll-like receptor. This gene model is located at the end of a short scaffold. The nucleotide sequence has 95% identity to another Sp-Tlr gene, so it could be a member of the Toll-like receptor family.
###Gene_Info_Comments GLEAN3_12005 ###
Possible Gene Prediction issue - it could be several concatenated proteins
###Gene_Info_Comments GLEAN3_25902 ###
Possible Gene Prediction issue - it could be several concatenated proteins
###Gene_Info_Comments GLEAN3_15305 ###
Unknown architecture-possible gene prediction issue
###Gene_Info_Comments GLEAN3_18488 ###
highly conserved - 69% identical to the human protein
###Gene_Info_Comments GLEAN3_10750 ###
trypsin-like protease with SR and several FN3 repeats - novel architecture
###Gene_Info_Comments GLEAN3_00406 ###
Ig domains (9) and a few FN3 - may be a fragment
###Gene_Info_Comments GLEAN3_10007 ###
Ig3/FN3/Ig2 - may be a fragment
###Gene_Info_Comments GLEAN3_10124 ###
Ig3/FN3-2/Ig7 - may be a fragment
###Gene_Info_Comments GLEAN3_10125 ###
Ig13/FN3 - may be a fragment
###Gene_Info_Comments GLEAN3_12992 ###
Igc2-9/FN3-2 - may be a fragment
domain structure  (9-2) is consistent with Ds-CAM (9-4-1-2) missing C-terminus.
Blast match to DCC is fifth after other better matches and domain structure not right for DCC (4-6)- probably not DCC/neogenin but similar molecule
###Gene_Info_Comments GLEAN3_25374 ###
novel architecture -SEMA and FN3
###Gene_Info_Comments GLEAN3_20387 ###
ROS-RELATED -single LY and several FN3 BUT lacks TM and kinase domains
###Gene_Info_Comments GLEAN3_27290 ###
FN3-TM-PPase - TM NOT PREDICTED - overlap with hits is limited
###Gene_Info_Comments GLEAN3_15923 ###
FN3-TM-PPase - overlap with hits is limited
###Gene_Info_Comments GLEAN3_23943 ###
has Ig - EGF - SEVERAL FN3-GPS-7TM
###Gene_Info_Comments GLEAN3_25788 ###
has  9 FN3s - 4 EGFCas - 3 more FN3-GPS-TM

Apart from lack of 7tm_2 domain this matches to overall pattern of LNB-7TM-GPCRs
May be missing C-terminus
###Gene_Info_Comments GLEAN3_01609 ###
has FN3-GPS-7TM
###Gene_Info_Comments GLEAN3_16363 ###
has mixture of FN3/EGFCa-GPS-7TM_2

matches to overall pattern of LNB-7TM-GPCRs
###Gene_Info_Comments GLEAN3_21832 ###
has CLECT-FN3-GPS-7TM
###Gene_Info_Comments GLEAN3_09018 ###
novel architecture - LIPOXYGENASE (LH2) HOMOLOGIES FLANKING MULTIPLE FN3 REPEATS
###Gene_Info_Comments GLEAN3_21340 ###
LEUCINE-RICH/FN3 PROTEIN -NOVEL ARCHITECTURE - HAS TBC DOMAIN
###Gene_Info_Comments GLEAN3_00435 ###
myosin light chain kinase - structure predicted may suggest duplication or assembly problems
###Gene_Info_Comments GLEAN3_19190 ###
Best matches are with the Rho-GEF kinases, Duet,  but that gene has an additional N-terminal segment - missing RhoGEF, PH and SH3 domains
###Gene_Info_Comments GLEAN3_13917 ###
lots of Ig domains and a few FN3 plus a mixed-function kinase - good matches to projectin and twitchin - clearly a muscle-specific structural kinase
###Gene_Info_Comments GLEAN3_28470 ###
Kallmann syndrome 1 homolog - WAP/FN3n/LDLa
###Gene_Info_Comments GLEAN3_04282 ###
Ig-FN3-2-BTB
###Gene_Info_Comments GLEAN3_26161 ###
Ig/FN3/Ig/FN3 alternating
###Gene_Info_Comments GLEAN3_00159 ###
hyalin protein - prediction is longer than cDNA sequences
###Gene_Info_Comments GLEAN3_17178 ###
FN3-SPRY - there are a couple of classes of proteins with that combination, sometimes plus other domains - no good Blast hits - domain structure is very similar to GL3_28216 - a good homolog of tripartite motif proteins - suspect this is one of those
###Gene_Info_Comments GLEAN3_28216 ###
FN3-SPRY - best matches are to tripartite motif-containing proteins from several species
###Gene_Info_Comments GLEAN3_23539 ###
FN3-SPRY - ALSO RING and BBOX - there are several chordate proteins with that combination - probably transcription factors - matches with five best hits do not include the RING domain but the domain structure of the matching segment is very similar to GL3_28216 - a good homolog of tripartite motif proteins - suspect this is one of those
###Gene_Info_Comments GLEAN3_27121 ###
"ECM or membrane adhesion protein - probably a fragment - many VWD/EGF/FN3 domains - somewhat homologous with human Fc gamma Ig-binding protein [AAD39266.1] and with zonadhesins of pig and rabbit [NP_999548.1, AAF63342.2] but there are gaps"
###Gene_Info_Comments GLEAN3_22706 ###
"ECM or membrane adhesion protein - probably a fragment - many VWD and FN3 domains - somewhat homologous with human Fc gamma Ig-binding protein [AAD39266.1] and with human zonadhesins [AAL04410.1, AAL04412.1, AAL04413.1] - but there are gaps"
###Gene_Info_Comments GLEAN3_03992 ###
CLECT/FN3/TM - MEMBER OF A LINKED CLUSTER OF THREE
###Gene_Info_Comments GLEAN3_03993 ###
CLECT/FN3-2/TM - MEMBER OF A LINKED CLUSTER OF THREE
###Gene_Info_Comments GLEAN3_03994 ###
CLECT-FN3-TM - MEMBER OF A LINKED CLUSTER OF THREE
###Gene_Info_Comments GLEAN3_02394 ###
CLECT-FN3-TM
###Gene_Info_Comments GLEAN3_05115 ###
CLECT-FN3-TM
###Gene_Info_Comments GLEAN3_12215 ###
CLECT-FN3-TM
###Gene_Info_Comments GLEAN3_23488 ###
CLECT-FN3-PANAP-TM -novel architecture
###Gene_Info_Comments GLEAN3_21126 ###
CLECT-FN3- noTM
###Gene_Info_Comments GLEAN3_08066 ###
CLECT-FN3- no TM
###Gene_Info_Comments GLEAN3_23487 ###
CLECT2-FBG-FN3-PANAP-TM -novel architecture
###Gene_Info_Comments GLEAN3_13668 ###
CLECT2-EGF3-FN3-TM
###Gene_Info_Comments GLEAN3_14375 ###
CLECT2-EGF3-FN3-TM
###Gene_Info_Comments GLEAN3_24360 ###
CLECT-EGF-FN3-TM
###Gene_Info_Comments GLEAN3_28703 ###
CLECT-EGF3-FN3-TM
###Gene_Info_Comments GLEAN3_19135 ###
CLECT-EGF2-FN3 - noTM
###Gene_Info_Comments GLEAN3_12463 ###
CLECT-EGF2-FN3 - no TM
###Gene_Info_Comments GLEAN3_00346 ###
CLECT/EGF2/FN3/TM - ALSO POSSIBLE PHOSPHATASE although PPase domain looks to be outside??
###Gene_Info_Comments GLEAN3_12855 ###
CLECT/FN3 alternating WITH BLOCK OF CUB DOMAINS IN MIDDLE -novel architecture
###Gene_Info_Comments GLEAN3_10216 ###
CLECT/FN3 alternating
###Gene_Info_Comments GLEAN3_27967 ###
ANK2/FN3/RA RA domains - looks like some sort of intracellular adaptor
###Gene_Info_Comments GLEAN3_18475 ###
Ig/FN3/TM - may be fragment - see adjacent gene
###Gene_Info_Comments GLEAN3_27951 ###
Ig3/FN3-2/TM - may be fragment - see adjacent gene
###Gene_Info_Comments GLEAN3_05899 ###
Ig2/FN3/TM - fragment
###Gene_Info_Comments GLEAN3_06614 ###
Ig2/FN3-3/TM -N-term half matches with Robo - Cterm half does not
###Gene_Info_Comments GLEAN3_08663 ###
Ig-4/FN3-4/TM -N-term half matches with Robo - Cterm half does not
###Gene_Info_Comments GLEAN3_10851 ###
Ig5/FN3-2/TM -looks like quite a good domain match for CDO
###Gene_Info_Comments GLEAN3_12530 ###
Ig/FN3/TM-fragment
###Gene_Info_Comments GLEAN3_15844 ###
Ig4/FN3/Ig/TM
###Gene_Info_Comments GLEAN3_15846 ###
Ig2/FN3/TM - fragment
###Gene_Info_Comments GLEAN3_24482 ###
Ig2/FN3/TM - fragment
###Gene_Info_Comments GLEAN3_24708 ###
Ig4/FN3-2/TM
###Gene_Info_Comments GLEAN3_03460 ###
Ig/FN3-3/Ig/FN3-2 /TM -may be fragment
###Gene_Info_Comments GLEAN3_09818 ###
Igc2-4/FN3-4/Igc2/FN3-2/TM

this does look like a Ds-CAM homolog - Ds-CAM has 9-4-1-2 arrangement of Ig/FN3 domains - this has 4-4-1-2 - suggests it's missing the N-terminal 5 Ig domains

NB - there are quite a few genes with 5 Ig repeats and nothing else - that might comprise the N-terminus
(GLEAN3_12351, GLEAN3_15273, GLEAN3_08772, GLEAN3_07532, GLEAN3_10002, GLEAN3_13487, GLEAN3_16889, GLEAN3_00469, GLEAN3_25431, GLEAN3_25430, GLEAN3_15584, GLEAN3_09608, GLEAN3_09024)
None of these is obviously adjacent - nor are any other Ig-only genes, judging from the numbering - needs browser work.
###Gene_Info_Comments GLEAN3_09318 ###
Ig6/FN3/TM -partial overlap only
###Gene_Info_Comments GLEAN3_15705 ###
Ig2/FN3/TM - MAY BE fragment
###Gene_Info_Comments GLEAN3_17022 ###
IG5/FN3/TM
###Gene_Info_Comments GLEAN3_18030 ###
Ig2/FN3/TM - may be a fragment
###Gene_Info_Comments GLEAN3_21725 ###
Ig5/FN3/TM - may be a fragment
###Gene_Info_Comments GLEAN3_24019 ###
Ig8/FN3/TM - may be a fragment
###Gene_Info_Comments GLEAN3_28401 ###
Ig5/FN3/TM - may be a fragment
###Gene_Info_Comments GLEAN3_25975 ###
Ig3/FN6/TM - partial overlap only - does not have predicted TM and C-terminus (putative cyto domain) does not have Neogenin_C
and is not homologous with DCC in Blast

Domain sequence (3-6) is consistent with 4-6 arrangement in DCC/neogenin - would suggest it's missing N-terminal Ig domain 
###Gene_Info_Comments GLEAN3_26725 ###
Ig4/FN3-2/TM
###Gene_Info_Comments GLEAN3_27415 ###
Ig2/FN3-3/TM
###Gene_Info_Comments GLEAN3_25966 ###
LRR repeats-Ig-EGF-FN3-TM
###Gene_Info_Comments GLEAN3_01538 ###
FN3/TM - probably a fragment
###Gene_Info_Comments GLEAN3_01957 ###
FN3/TM - probably a fragment
###Gene_Info_Comments GLEAN3_04072 ###
FN3-2/TM - probably a fragment
###Gene_Info_Comments GLEAN3_04128 ###
FN3-2/TM - probably a fragment
###Gene_Info_Comments GLEAN3_13954 ###
FN3/TM - probably a fragment
###Gene_Info_Comments GLEAN3_19460 ###
FN3/TM - probably a fragment
###Gene_Info_Comments GLEAN3_20590 ###
FN3-2/TM - probably a fragment
###Gene_Info_Comments GLEAN3_22063 ###
FN3-2/TM - probably a fragment
###Gene_Info_Comments GLEAN3_02657 ###
FN3-3/TM - probably a fragment
###Gene_Info_Comments GLEAN3_02763 ###
EGF3/FN3-3/TM - no Ig (?) - probably missing C-terminus
###Gene_Info_Comments GLEAN3_04758 ###
FN3-2/TM - probably a fragment
###Gene_Info_Comments GLEAN3_07789 ###
FN3-3/TM - probably a fragment
###Gene_Info_Comments GLEAN3_11473 ###
FN3-8/TM
###Gene_Info_Comments GLEAN3_12998 ###
FN3-6/TM
###Gene_Info_Comments GLEAN3_15746 ###
FN3-5/TM
###Gene_Info_Comments GLEAN3_16936 ###
FN3-31/TM
###Gene_Info_Comments GLEAN3_17208 ###
FN3-14/TM
###Gene_Info_Comments GLEAN3_18070 ###
FN3-4/TM- probably a fragment
###Gene_Info_Comments GLEAN3_20335 ###
FN3-3/TM - probably a fragment
###Gene_Info_Comments GLEAN3_21244 ###
FN3-3/TM - probably a fragment
###Gene_Info_Comments GLEAN3_22585 ###
FN3-4/TM- probably a fragment
###Gene_Info_Comments GLEAN3_26349 ###
FN3-4/TM- probably a fragment
###Gene_Info_Comments GLEAN3_26991 ###
FN3-3/TM - probably a fragment
###Gene_Info_Comments GLEAN3_22586 ###
EGF/Ig/FN3-2/TM - probably missing both ends
###Gene_Info_Comments GLEAN3_19051 ###
EGF/FN3-2/TM - see adjacent gene
###Gene_Info_Comments GLEAN3_27686 ###
EGF/FN3/TM - see adjacent gene - missing kinase
###Gene_Info_Comments GLEAN3_27687 ###
EGF/FN3-3/TM - see adjacent gene- missing kinase
###Gene_Info_Comments GLEAN3_00673 ###
EGF5/FN3-3/TM - probably fragment
###Gene_Info_Comments GLEAN3_02870 ###
EGF-3/FN3-2/TM - missing kinase
###Gene_Info_Comments GLEAN3_07983 ###
EGF2/FN3/TM - missing kinase
###Gene_Info_Comments GLEAN3_10587 ###
EGF/FN3/TM
###Gene_Info_Comments GLEAN3_11180 ###
EGF/FN3/TM
###Gene_Info_Comments GLEAN3_12810 ###
EGF/FN3-2/TM
###Gene_Info_Comments GLEAN3_13172 ###
EGF2/FN3/TM -partial overlap only - missing kinase
###Gene_Info_Comments GLEAN3_18153 ###
EGF2/FN3/TM- partial overlap only
###Gene_Info_Comments GLEAN3_23923 ###
EGF-2/FN3-3/TM - partial overlap only
###Gene_Info_Comments GLEAN3_04347 ###
EGF-2/FN3-3/TM - very partial overlap
###Gene_Info_Comments GLEAN3_09654 ###
EGF-3/FN3-2/TM - very partial overlap
###Gene_Info_Comments GLEAN3_11804 ###
EGF/FN3-3/TM - very partial overlap
###Gene_Info_Comments GLEAN3_14858 ###
EGF-2/FN3-3/TM -partial overlap - missing kinase
###Gene_Info_Comments GLEAN3_15381 ###
EGF/FN3-4/TM - partial overlap
###Gene_Info_Comments GLEAN3_16530 ###
EGF-2/FN3-3/TM - very partial overlap
###Gene_Info_Comments GLEAN3_21253 ###
adhesion receptor - FN3/EGF_Ca intermingled - single CCP - a bit similar in composition to FLJ00133 protein but domain organisation different - see adjacent gene - pfam gives some hyalin repeats
###Gene_Info_Comments GLEAN3_23398 ###
two FN3 - could be ECM or receptor - see adjacent genes
###Gene_Info_Comments GLEAN3_23400 ###
two FN3 - could be ECM or receptor - see adjacent genes
###Gene_Info_Comments GLEAN3_22518 ###
two FN3 - could be ECM or receptor - see adjacent gene
###Gene_Info_Comments GLEAN3_02470 ###
two FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_02531 ###
two FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_02801 ###
two FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_03777 ###
two FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_04756 ###
two FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_06052 ###
two FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_11337 ###
two FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_16143 ###
two FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_17760 ###
two FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_17979 ###
two FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_21019 ###
two FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_21268 ###
two FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_22616 ###
two FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_22764 ###
two FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_23304 ###
two FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_23502 ###
two FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_22519 ###
three FN3 - could be ECM or receptor - see adjacent gene
###Gene_Info_Comments GLEAN3_03995 ###
three FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_04950 ###
three FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_11397 ###
three FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_11648 ###
three FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_13667 ###
three FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_17929 ###
three FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_19818 ###
three FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_22314 ###
three FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_24061 ###
three FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_25664 ###
three FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_28131 ###
three FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_28665 ###
three FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_23737 ###
Ig2/FN3-3 - could be ECM or receptor - see adjacent gene
###Gene_Info_Comments GLEAN3_22705 ###
four FN3 - could be ECM or receptor - see adjacent gene
###Gene_Info_Comments GLEAN3_23738 ###
four FN3 - could be ECM or receptor - see adjacent gene
###Gene_Info_Comments GLEAN3_22580 ###
four FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_26676 ###
four FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_22535 ###
five FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_04309 ###
five FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_14850 ###
four FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_17765 ###
five FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_21252 ###
12 FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_18476 ###
three FN3 - could be ECM or receptor - see adjacent gene
###Gene_Info_Comments GLEAN3_21082 ###
three FN3 - could be ECM or receptor - see adjacent gene
###Gene_Info_Comments GLEAN3_17766 ###
three FN3 - could be ECM or receptor - last of several similar adjacent fragments
###Gene_Info_Comments GLEAN3_15008 ###
lots of EGF - a few FN3 repeats - one CUB domain - Blast hits with Notch but does not really look like Notch
###Gene_Info_Comments GLEAN3_00454 ###
Ig/FN3-4 - weak match with Nr-CAM
###Gene_Info_Comments GLEAN3_00554 ###
Ig4/FN3-2 - weak match with NCAM
###Gene_Info_Comments GLEAN3_01159 ###
EGF/FN3-2
###Gene_Info_Comments GLEAN3_05508 ###
Ig6/DISIN/FN3-2/IG3/FN3-4 - patchy match with contactin
###Gene_Info_Comments GLEAN3_09039 ###
Ig/FN3-5 - weak match with neogenin
###Gene_Info_Comments GLEAN3_09757 ###
Ig5/FN3-4 - no TM - incomplete?
###Gene_Info_Comments GLEAN3_13497 ###
Ig/FN3-3
###Gene_Info_Comments GLEAN3_13927 ###
Ig/FN3-3
###Gene_Info_Comments GLEAN3_26307 ###
Ig4/FN3-2
###Gene_Info_Comments GLEAN3_28098 ###
Ig/FN3-4
###Gene_Info_Comments GLEAN3_28100 ###
Ig/FN3-5
###Gene_Info_Comments GLEAN3_24882 ###
Ig3/FN3  - see adjacent gene
###Gene_Info_Comments GLEAN3_27952 ###
Ig4/FN3  - see adjacent gene
###Gene_Info_Comments GLEAN3_04501 ###
Ig5/FN3- weak match with nephrin
###Gene_Info_Comments GLEAN3_04746 ###
Ig3/FN3 - weak match with NCAM
###Gene_Info_Comments GLEAN3_05900 ###
Ig3/FN3 - weak match with nephrin
###Gene_Info_Comments GLEAN3_06087 ###
Ig4/FN3
###Gene_Info_Comments GLEAN3_07323 ###
Ig4/FN3
###Gene_Info_Comments GLEAN3_08387 ###
Ig/FN3
###Gene_Info_Comments GLEAN3_08771 ###
Ig2/FN3
###Gene_Info_Comments GLEAN3_09571 ###
Ig/FN3
###Gene_Info_Comments GLEAN3_10291 ###
Ig2/FN3-2
###Gene_Info_Comments GLEAN3_14759 ###
Ig2/FN3
###Gene_Info_Comments GLEAN3_17488 ###
Ig5/FN3
###Gene_Info_Comments GLEAN3_17889 ###
Ig/FN3-2
###Gene_Info_Comments GLEAN3_21745 ###
Ig3/FN3
###Gene_Info_Comments GLEAN3_23843 ###
Ig2/FN3 - weak match with sidekick
###Gene_Info_Comments GLEAN3_24986 ###
Ig/FN3-2
###Gene_Info_Comments GLEAN3_26399 ###
EGF/Ig/EGF3/FN3-4
###Gene_Info_Comments GLEAN3_01092 ###
Ig4/FN3
###Gene_Info_Comments GLEAN3_25433 ###
Ig8/FN3
###Gene_Info_Comments GLEAN3_21083 ###
four FN3 - could be ECM or receptor - see adjacent gene
###Gene_Info_Comments GLEAN3_22641 ###
four FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_21262 ###
FN3-10- one LH
###Gene_Info_Comments GLEAN3_15007 ###
FN3-3/EGF/FN3-7/EGF2 - possible TENASCIN - BUT NO FBG
###Gene_Info_Comments GLEAN3_01682 ###
FN3 domain
###Gene_Info_Comments GLEAN3_02469 ###
FN3 domain
###Gene_Info_Comments GLEAN3_09658 ###
FN3 domain
###Gene_Info_Comments GLEAN3_14031 ###
FN3 domain
###Gene_Info_Comments GLEAN3_14854 ###
FN3 domain
###Gene_Info_Comments GLEAN3_19478 ###
FN3 domain
###Gene_Info_Comments GLEAN3_20676 ###
FN3 domain
###Gene_Info_Comments GLEAN3_21011 ###
FN3 domain
###Gene_Info_Comments GLEAN3_25558 ###
FN3 domain
###Gene_Info_Comments GLEAN3_27203 ###
FN3 domain
###Gene_Info_Comments GLEAN3_02112 ###
FN3-18/EGF12 - possible TENASCIN - BUT no FBG
###Gene_Info_Comments GLEAN3_03659 ###
FN3-7/EGF2 - conceivably TENASCIN - BUT NO FBG
###Gene_Info_Comments GLEAN3_01994 ###
EGF/Ig/FN3
###Gene_Info_Comments GLEAN3_03814 ###
EGF/Ig/FN3
###Gene_Info_Comments GLEAN3_04830 ###
Ig/EGF2/FN3-4
###Gene_Info_Comments GLEAN3_08166 ###
EGF5/Ig5/FN3-2
###Gene_Info_Comments GLEAN3_16493 ###
Ig/EGF-2/FN3-2
###Gene_Info_Comments GLEAN3_23397 ###
FN3/EGF/FN3-2 - see adjacent genes
###Gene_Info_Comments GLEAN3_19052 ###
EGF/FN3 - see adjacent gene
###Gene_Info_Comments GLEAN3_24881 ###
EGF/FN3  - see adjacent gene
###Gene_Info_Comments GLEAN3_01907 ###
EGF3/FN3-3
###Gene_Info_Comments GLEAN3_03293 ###
EGF/FN3
###Gene_Info_Comments GLEAN3_03931 ###
EGF/FN3
###Gene_Info_Comments GLEAN3_13220 ###
EGF/FN3-3
###Gene_Info_Comments GLEAN3_19659 ###
EGF/FN3
###Gene_Info_Comments GLEAN3_21589 ###
EGF/FN3
###Gene_Info_Comments GLEAN3_25304 ###
EGF7/FN3
###Gene_Info_Comments GLEAN3_26416 ###
EGF2/FN3
###Gene_Info_Comments GLEAN3_03418 ###
EGF/FN3-2
###Gene_Info_Comments GLEAN3_04018 ###
EGF/FN3-3/EGF
###Gene_Info_Comments GLEAN3_05820 ###
EGF2/FN3
###Gene_Info_Comments GLEAN3_06514 ###
EGF/FN3
###Gene_Info_Comments GLEAN3_07791 ###
EGF2/FN3
###Gene_Info_Comments GLEAN3_09637 ###
EGF/FN3-3
###Gene_Info_Comments GLEAN3_11344 ###
EGF/FN3
###Gene_Info_Comments GLEAN3_23165 ###
EGF2/FN3-3
###Gene_Info_Comments GLEAN3_01463 ###
all FN3 - could be ECM or receptor - see adjacent gene
###Gene_Info_Comments GLEAN3_24778 ###
all FN3 - could be ECM or receptor - see adjacent gene
###Gene_Info_Comments GLEAN3_24777 ###
all FN3 - could be ECM or receptor  - see adjacent gene
###Gene_Info_Comments GLEAN3_00736 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_00748 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_00797 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_00925 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_01610 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_01658 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_02802 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_02820 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_03086 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_03583 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_05060 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_05960 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_07364 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_09767 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_09943 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_10144 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_10788 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_11113 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_13800 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_14469 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_15918 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_15919 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_17207 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_17762 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_17937 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_17938 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_28716 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_10429 ###
VWC AND FN3 DOMAINS INTERMINGLED - NO TM PREDICTED BUT THERE ARE TM RECEPTORS KNOWN WITH THESE TWO DOMAINS
###Gene_Info_Comments GLEAN3_09576 ###
all Ig - could be ECM or receptor
###Gene_Info_Comments GLEAN3_01222 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_05015 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_26321 ###
all FN3 - could be ECM or receptor
###Gene_Info_Comments GLEAN3_00997 ###
This is a second C3 gene in the sea urchin.  The encoded protein has a conserved thioester site and a single cleavage site to generate alpha and beta chains.  There is a histidine in the C-terminal direction that functions in substrate binding choice.

The gene model is missing the 5' end - about the first 130 amino acids.  However, Genboree shows an overlap with NCBI:prediction XM-775838.1 (scaffold 1499) that may contain the missing part of the gene.
###Gene_Info_Comments GLEAN3_05193 ###
The encoded protein has a thioester site and two cleavage sites: the first cleaves between the alpha and beta chains and the second between the alpha and gamma chains.  This structure is typical of C4 proteins in mammals but also of the C3 proteins in the cyclostomes.
###Gene_Info_Comments GLEAN3_28445 ###
likely missing two exons on the C-terminus.
###Gene_Info_Comments GLEAN3_22988 ###
The encoded protein has a thioester site and a histidine in the N terminal direction that may function in target choice.  The sequence has a cleavage site between the alpha and beta chains; however, the beta chain is very short and the alpha chain is long.

GLEAN3_22988 overlaps with GLEAN3_19601.  See alignment with this annotation.  It is not clear why there is an overlap, but it may be an assembly problem.
###Gene_Info_Comments GLEAN3_21668 ###
Annotation entered by Bob Obar (robar@scientist.com).
This is one of 4 tandem alpha-tubulin Gene Models (GLEAN3_21667 - 21770).
###Gene_Info_Comments GLEAN3_21669 ###
Annotation entered by Bob Obar (robar@scientist.com).
This is one of 4 tandem alpha-tubulin Gene Models (GLEAN3_21667 - 21770).
###Gene_Info_Comments GLEAN3_16746 ###
Annotation entered by Bob Obar (robar@scientist.com).
###Gene_Info_Comments GLEAN3_28221 ###
Annotation entered by Bob Obar (robar@scientist.com).
###Gene_Info_Comments GLEAN3_24615 ###
Annotation entered by Bob Obar (robar@scientist.com).  This Gene Model contains a full-length alpha-tubulin with a duplication of 125 amino acids near the amino terminus of the predicted protein.
###Gene_Info_Comments GLEAN3_12679 ###
Annotation entered by Bob Obar (robar@scientist.com).  This Gene Model contains a nearly full-length alpha-tubulin that is missing 15 amino acids at the amino terminus of the predicted protein.
###Gene_Info_Comments GLEAN3_27848 ###
This is the N terminal part of the protein; the C terminal portion is encoded by GLEAN3_26498. These 2 gleans overlap (nucleotide level): bases 724-1066 (this glean).  The sequence after 1066 {TAGGATTATTGAGAAATCTTTAA} probably does not belong to this gene.
###Gene_Info_Comments GLEAN3_27380 ###
3' of CDS missing (TM domain)
###Gene_Info_Comments GLEAN3_05205 ###
similar to Phosphatidyl Serine Receptor (PSR), involved in phosphatidylserine-specific apoptotic cell clearance (e.g. macrophages engulfing apoptotic T cells)

-model assembled/modified from GLEAN3_05204 and GLEAN3_05205
###Gene_Info_Comments GLEAN3_05885 ###
Partial sequence.
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137537340-17614-146655645239.BLASTQ1
###Gene_Info_Comments GLEAN3_16937 ###
The 5' end of the sequence annotated here is GLEAN3_16936.  GLEAN3_16936 sequences have been pasted in front of GLEAN3_16937 sequences.  A portion of GLEAN3_16936 at the boundaries of contig AAGJ01178647 and contig AAJ01178648 contains an identical repeat.  This is probably a genome assembly error and the duplicate sequence has been removed from the peptide sequence reported here (but not from the DNA sequence).   The deleted sequence in GLEAN3_16936 is:
ITTGLYNDEMVTSSTTRNCSTTDCESFTVDFDTLNSGTLYTLYAGVVQSSGREVVPLLAKAATIPESAVDLQFTSIGRNYVVLTWDNPAGMIDSYNISYYPVNDITKLMFEVVQAAAESNVLRVDDLNEGMNYSFTVVSLLEVEADLQEMGAPVEVFAVVGVLGSLNITAFDETTMNIEWEQVDVED.
###Gene_Info_Comments GLEAN3_05602 ###
Partial sequence.
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137537204-5277-205811584566.BLASTQ4
###Gene_Info_Comments GLEAN3_12073 ###
Partial sequence.
###Gene_Info_Comments GLEAN3_06699 ###
Similarity to Chlamydomonas reinhardtii Inner_Dynein_Arm_1_Intermediate Chain IC140 (C_530081|166736|IA1-IC140|) and Homo sapiens axonemal dynein intermediate polypeptide 2.  Contains WD-40 motifs.  The protein is also essentially identical to the Anthocidaris crassispina protein (gi|2494216|sp|Q16960|DYI3_ANTCR Dynein intermediate chain 3, ciliary).
###Gene_Info_Comments GLEAN3_12809 ###
Similarity to Chlamydomonas reinhardtii Inner_Dynein_Arm_1_Intermediate Chain IC140 (C_530081|166736|IA1-IC140|) and Homo sapiens testis development protein NYD-SP29 (NP_660155).
###Gene_Info_Comments GLEAN3_08502 ###
Similar to protein serine/threonine phosphatase 4 regulatory subunit 1.
###Gene_Info_Comments GLEAN3_26101 ###
Similar to PP4R1
###Gene_Info_Comments GLEAN3_15320 ###
This Gene Model is a close homolog of Chlamydomonas reinhardtii Inner Dynein Arm Light Chain p28 (IA-IC28, C_740003) and Homo sapiens axonemal dynein light chain (NP_003453).  It has a GenBank ID gi|1354084|gb|AAC47111.1| (axonemal dynein light chain p33).
###Gene_Info_Comments GLEAN3_18432 ###
Similar to Dentin sialophosphoprotein precursor (DMP-3).  Partial sequence.
###Gene_Info_Comments GLEAN3_17261 ###
Similar to Neurabin 1
###Gene_Info_Comments GLEAN3_10093 ###
this piece does not match well with any of the perlecan gene segments - so it is not from there.
###Gene_Info_Comments GLEAN3_00937 ###
This looks like the N-terminus of perlecan - with other parts of the gene in different gene predictions.

GLEAN3_12324/GLEAN3_28620 BOTH look like the middle of the gene and they are duplications - GLEAN3_12324 is longer  

GLEAN3_26338 looks like the C-terminus
###Gene_Info_Comments GLEAN3_00974 ###
This gene matches predicted gene products for coronin 1A  
(gi|72123047|ref|XP_791382.1| PREDICTED: similar to coronin, actin binding protein, 1A [Strongylocentrotus purpuratus])
###Gene_Info_Comments GLEAN3_02812 ###
EGF,EGF_Lam and FAS1 domains and a TM domain
Closest match in human is stabilin but the urchin gene lacks the LINK domain and has a somewhat different pattern of EGF/FAS1 domains
###Gene_Info_Comments GLEAN3_15419 ###
GLEAN3_15419 model is part of predicted Sp-Pask. Other, non-overlapping part is in GLEAN3_16604 on scaffold 70175.  FgeneshAB prediction S.P_Scaffold70175 may have additional exons, based on alignment with mammalian PAS-K.
###Gene_Info_Comments GLEAN3_03676 ###
two Fas1 domains - member of a family of genes with similar structures and sequences

Note that GLEAN3_03678 has same structure
###Gene_Info_Comments GLEAN3_03678 ###
two FAS1 domains - member of a family of genes with similar structures and sequences

Virtually identical to GLEAN3_20485 - latter has an additional short segment in the middle (exon??)

Also virtually identical to GLEAN3_03676 - but there are some minor sequence differences
###Gene_Info_Comments GLEAN3_20485 ###
two FAS1 domains - probably a fragment

Virtually identical to GLEAN3_03678 - latter is missing a short segment in the middle 
###Gene_Info_Comments GLEAN3_05198 ###
two FAS1 domains

member of a family of genes with similar structures and sequences
###Gene_Info_Comments GLEAN3_06345 ###
two Fas1 domains - member of a family of genes with similar structures and sequences

Note that GLEAN3_06346 has same structure and virtually same sequence
###Gene_Info_Comments GLEAN3_06346 ###
two Fas1 domains - member of a family of genes with similar structures and sequences

Note that GLEAN3_06345 has same structure and virtually same sequence
###Gene_Info_Comments GLEAN3_15670 ###
two Fas1 domains - member of a family of genes with similar structures and sequences
###Gene_Info_Comments GLEAN3_00041 ###
enormous protein with three VWD/TIL, three VWC, lots of segments of low complexity and a CT at C-terminus

general domain composition looks mucin-like
###Gene_Info_Comments GLEAN3_00070 ###
NIDO, AMOP, VWD,CCP - LOOKS LIKE A MUCIN
###Gene_Info_Comments GLEAN3_01942 ###
VWCs, two VWDs and a couple of EGFs - novel architecture
###Gene_Info_Comments GLEAN3_03277 ###
FOUR VWD - also TIL and VWC - looks like a mucin
###Gene_Info_Comments GLEAN3_03376 ###
AMOP, VWD,CCP - LOOKS LIKE A MUCIN - MISSING N-TERMINAL NIDO
###Gene_Info_Comments GLEAN3_05406 ###
NIDO, AMOP, VWD AND EGF_CA TM - rather similar structure to mucin4d of chickens
###Gene_Info_Comments GLEAN3_09378 ###
LPD_N, VWD only - common structure for vitellogenin
###Gene_Info_Comments GLEAN3_09395 ###
NIDO/VWD - RATHER MUCIN-LIKE
###Gene_Info_Comments GLEAN3_13189 ###
three VWD domains, also TIL and VWC - looks like a mucin
###Gene_Info_Comments GLEAN3_13334 ###
NIDO, AMOP, VWD,CCP - LOOKS LIKE A MUCIN
###Gene_Info_Comments GLEAN3_15633 ###
NIDO, AMOP, VWD,CCP - LOOKS LIKE A MUCIN
###Gene_Info_Comments GLEAN3_16052 ###
LPD_N, VWD only - common structure for vitellogenin
###Gene_Info_Comments GLEAN3_17361 ###
enormous protein with lots of segments of low complexity (including GLTT repeats) as well as a cluster of VWD/TIL/EGFs/CCPs/EGFs near C-terminus
###Gene_Info_Comments GLEAN3_20247 ###
NIDO, VWD AND EGF_CA TM - very similar structure to mucin4d of chickens
###Gene_Info_Comments GLEAN3_20744 ###
enormous protein - his-rich domain at N-terminus, 3-4 VWD or TIL/VWC domains,  a run of LDLa, a long segment of low complexity and LDLa/FA58C at C-terminus - some compositional similarity with SCO-spondin but not same
###Gene_Info_Comments GLEAN3_21744 ###
FA58C and VWD with a couple of EGFs - novel architecture
###Gene_Info_Comments GLEAN3_24352 ###
NIDO, AMOP, VWD - LOOKS LIKE A MUCIN
###Gene_Info_Comments GLEAN3_27118 ###
VWC/VWD - two repeats -  note that adjacent gene (27119) has multiple repeats of the same kind
###Gene_Info_Comments GLEAN3_27119 ###
VWC/VWD - in multiple repeats - novel architecture - note that adjacent gene (27118) has similar composition
###Gene_Info_Comments GLEAN3_28683 ###
LPD_N, VWD, VWA only - vitellogenin in Anopheles contains extra VWA also
###Gene_Info_Comments GLEAN3_00538 ###
EGF x 5 - CCP - TM 
NOVEL ARCHITECTURE
###Gene_Info_Comments GLEAN3_02986 ###
CCP-CLECT-CCP-EGF-EGF-TM
NOVEL ARCHITECTURE

###Gene_Info_Comments GLEAN3_09610 ###
EGF-EGF-VWD - probably a fragment
###Gene_Info_Comments GLEAN3_00782 ###
VWD only - probably a fragment
###Gene_Info_Comments GLEAN3_16089 ###
EGF-VWD-EGF - probably a fragment
###Gene_Info_Comments GLEAN3_18155 ###
VWD-EGF - probably a fragment
###Gene_Info_Comments GLEAN3_20181 ###
VWF only - almost certainly a fragment
###Gene_Info_Comments GLEAN3_16222 ###
VWF only - almost certainly a fragment
###Gene_Info_Comments GLEAN3_28685 ###
large protein with a single VWD towards the C-terminus
###Gene_Info_Comments GLEAN3_17171 ###
FBG - EGF x2 - CLECT  - novel architecture
no particularly informative Blast hits
###Gene_Info_Comments GLEAN3_23671 ###
novel architecture - several domains characteristic of adhesion proteins - FA58C, MAM, FBG, SR
No homologues but this is a known St.purp cDNA
Pancer,Z. Dynamic expression of multiple scavenger receptor cysteine-rich genes in coelomocytes of the purple sea urchin
Proc. Natl. Acad. Sci. U.S.A. 97 (24), 13156-13161 (2000)
###Gene_Info_Comments GLEAN3_21993 ###
EGF-FBG - similar to C-terminus of GLEAN3_24020 which also has  a pfam:Nacht domain

Novel architecture - FBG may imply role in innate immunity.
###Gene_Info_Comments GLEAN3_06004 ###
has 7 TSP1 repeats, a TM domain and its long putative cytoplasmic domain contains a dual function kinase domain (could it be a PI kinase?)

Anyway, it's a novel architecture - if the gene prediction is correct.
###Gene_Info_Comments GLEAN3_06084 ###
large protein with mixture of multiple CCP, EGFCa and HYR domains and one TSP1 domain at the C-terminus
Novel architecture - shared with GLEAN3_10445 and GLEAN3_19944
###Gene_Info_Comments GLEAN3_10445 ###
large protein with mixture of multiple CCP, EGFCa and HYR domains and one TSP1 domain at the C-terminus
Novel architecture - shared with GLEAN3_06084 and GLEAN3_19944
###Gene_Info_Comments GLEAN3_19944 ###
large protein with mixture of multiple CCP, EGFCa and HYR domains and one TSP1 domain at the C-terminus
Novel architecture - shared with GLEAN3_10445 and GLEAN3_06084
###Gene_Info_Comments GLEAN3_19437 ###
large protein with multiple TSP1, FA58C, gal-lectin and CLECT domains intermingled

novel architecture - similar to GLEAN3_00691 and GLEAN3_05426
###Gene_Info_Comments GLEAN3_04017 ###
NOVEL ARCHITECTURE - WAP-IG-KU-KU-C345C
###Gene_Info_Comments GLEAN3_06068 ###
novel architecture - intermingled SR and FU repeats followed by a series of Ig domains and a TM
looks like a fragment of the predicted gene UPI0000583F83 - similar to deleted in malignant brain tumors 1 isoform c precursor

compare GLEAN3_24528 which is a more complete version of this gene
###Gene_Info_Comments GLEAN3_17202 ###
novel architecture - intermingled EGF/EGFCa/Igv/Igc2 domains followed by a set of VWC domains
###Gene_Info_Comments GLEAN3_22250 ###
looks like a good model of an LRR/Ig membrane receptor - has complete set of LRR with NT and CT domains followed by Igc2 and TM - there are quite a few receptors of this type in humans

BEST MATCH IS LRIG receptors - some homology with Gp-V of platelets
###Gene_Info_Comments GLEAN3_00425 ###
looks like a pretty good model of an LRR/Ig membrane receptor - has complete set of LRR with NT and CT domains followed by Igc2 - no TM - there are quite a few receptors of this type in humans


###Gene_Info_Comments GLEAN3_05538 ###
looks like a pretty good model of an LRR/Ig membrane receptor - has a set of 7 LRR repeats (no NT) and CT domain followed by Igc2 and TM - there are quite a few receptors of this type in humans
###Gene_Info_Comments GLEAN3_00186 ###
looks like a good model of an LRR/Ig membrane receptor - has complete set of LRR with NT and CT domains followed by Igc2 FN3 and TM - there are quite a few receptors of this type in humans
###Gene_Info_Comments GLEAN3_12819 ###
looks like a pretty good model of an LRR/Ig membrane receptor - has a set of 5 LRR (no NT) and a CT domain followed by Igc2 and TM - there are quite a few receptors of this type in humans
###Gene_Info_Comments GLEAN3_20790 ###
looks like a partial model of an LRR/Ig membrane receptor - has 4 LRR repeats followed by a CT domain and an Igc2 - 
lacks LR_NT at N-terminus and TM at C-terminus - there are quite a few receptors of this type in humans
###Gene_Info_Comments GLEAN3_02757 ###
looks like a good model of an LRR/Ig membrane receptor - has complete set of LRR with NT and CT domains followed by 2 Igc2 and TM - there are quite a few receptors of this type in humans
###Gene_Info_Comments GLEAN3_15612 ###
looks like a good model of an LRR/Ig membrane receptor - has complete set of LRR with NT and CT domains followed by Ig and TM - there are quite a few receptors of this type in humans
###Gene_Info_Comments GLEAN3_17564 ###
looks like a good model of an LRR/Ig membrane receptor - has complete set of LRR with NT and CT domains followed by Ig but no TM - there are quite a few receptors of this type in humans
###Gene_Info_Comments GLEAN3_18080 ###
looks like a good model of an LRR/Ig membrane receptor - has complete set of LRR with NT and CT domains followed by Ig but no TM - there are quite a few receptors of this type in humans
###Gene_Info_Comments GLEAN3_01129 ###
looks like a pretty good model of an LRR/Ig membrane receptor - has a set of 12 LRR repeats (no NT) and a CT domain followed by Igc2 and TM - there are quite a few receptors of this type in humans
###Gene_Info_Comments GLEAN3_18608 ###
looks like a partial model of an LRR/Ig membrane receptor - has a set of 4 LRR repeats (no NT) and a CT domain followed by Ig but no TM - there are quite a few receptors of this type in humans
###Gene_Info_Comments GLEAN3_11759 ###
looks like a pretty good model of an LRR/Ig membrane receptor - has a set of LRR repeats (no NT) and a CT domain followed by Igc2 but no TM - there are quite a few receptors of this type in humans
###Gene_Info_Comments GLEAN3_25731 ###
looks like a pretty good model of an LRR/Ig membrane receptor - has a set of 6 LRR repeats (no NT) and CT domains followed by Ig and TM - there are quite a few receptors of this type in humans
###Gene_Info_Comments GLEAN3_11637 ###
looks like a pretty good model of an LRR/Ig membrane receptor - has a set of  6 LRR repeats (no NT) and a CT domain followed by Ig and TM - there are quite a few receptors of this type in humans
###Gene_Info_Comments GLEAN3_04660 ###
looks like a good model of an LRR/Ig membrane receptor - has complete set of LRR with NT and CT domains followed by six Ig domains and TM - there are quite a few receptors of this type in humans
###Gene_Info_Comments GLEAN3_14240 ###
looks like a good model of an LRR/Ig membrane receptor - has complete set of LRR with NT and CT domains followed by Igc2 and TM - there are quite a few receptors of this type in humans
###Gene_Info_Comments GLEAN3_21853 ###
multiple TSP1 repeats with one Igc2 domain in the middle 

possibly an ADAM-TS
###Gene_Info_Comments GLEAN3_23512 ###
four TSP1 repeats and an Ig domain - probably a fragment
###Gene_Info_Comments GLEAN3_10954 ###
e val = 0.0 to NP_999823 from S. purpuratus, but Glean3_10954 is missing aa's at its start and has an insertion.  Gene model still to be modified to agree with cloning data.
e val = 0.0 to NP_055785 from Homo sapiens.
e val = e-125 to C_620048 from Chlamydomonas.
Annotated by RL Morris, A Musante, K Judkins, B Rossetti, A Rawson.
###Gene_Info_Comments GLEAN3_19990 ###
The automated GLEAN prediction for this Gene Model contained a duplication of 76 amino acids near the amino terminus of the predicted protein. On the assumption that this apparent duplication is an assembly error rather than a true sequence duplication, it was edited out of the GLEAN sequence.
###Gene_Info_Comments GLEAN3_17705 ###
A sequence containing the conserved motif PSSALRE, characteristic of this kinase in other species, is missing in GLEAN3_17705.
###Gene_Info_Comments GLEAN3_04406 ###
Similar to Homo sapiens neuroglobin (NGB) mRNA, complete cds
###Gene_Info_Comments GLEAN3_10004 ###
This is a gene with 19 perfect hyalin repeats in tandem, each present as a separate exon and each with a high pfam score as a hyalin repeat.  Expressed in the embryo at a modest level.
###Gene_Info_Comments GLEAN3_17620 ###
This hyalin-like gene may be incomplete and is expressed in embryos at very low levels, if at all, based on the tiling experiment.  It has an EGF - 11 hyalin repeats - EGF.
###Gene_Info_Comments GLEAN3_12490 ###
The GLEAN3 prediction doesn't include the C-terminus of the protein
###Gene_Info_Comments GLEAN3_28266 ###
Gene accepted as is.  Not expressed in embryo but has 6 EGF repeats, 49 hyalin repeats 
###Gene_Info_Comments GLEAN3_28066 ###
Most likely artifactual (haplotype ?) duplication of GLEAN3_16079 (based on proximity of all other identified Type I Activin Like Receptor  to glean3_16079) 
###Gene_Info_Comments GLEAN3_16079 ###
GLEAN3_28066 seems to be an artifactual duplication of this gene.
It also seems that the first exon of this prediction is an extra one that is not predicted in the Angerer genescan.
###Gene_Info_Comments GLEAN3_14949 ###
This hyalin-like gene is not expressed in the embryo, has 23 hyalin repeats, each with a low pfam score.
###Gene_Info_Comments GLEAN3_19338 ###
Sequence discrepancies with cDNA data.
GLEAN3 predicts a longer protein.
###Gene_Info_Comments GLEAN3_01500 ###
Expressed in the embryo based on tiling experiment.  Highly conserved at N terminus with rat.  Exons predicted to be correct in glean model
###Gene_Info_Comments GLEAN3_21497 ###
unclear duplication GLEAN3_00669
###Gene_Info_Comments GLEAN3_08552 ###
Gene model corrected (2 exons modified) after careful comparison with vertebrate orthologs and domain analysis. The 3' part of the gene is supported by EST evidence (CD323084 StrPu537.001446 from 20hr blastula stage library)
###Gene_Info_Comments GLEAN3_22049 ###
GLEAN3 prediction corresponds to the NP_999778.1 cDNA sequence only for the first 2/3 of the sequence
###Gene_Info_Comments GLEAN3_22153 ###
Highly expressed in embryos,  has high homology to rat casein kinase, gene looks to be complete as modeled
###Gene_Info_Comments GLEAN3_04723 ###
It's possible exon 4 is larger than indicated here.
###Gene_Info_Comments GLEAN3_13435 ###
The BRCA2 repeats are highly conserved and the gene is similar to the length of
###Gene_Info_Comments GLEAN3_23599 ###
It is very similar to the P.l Dnmt1, with almost the same number of introns
###Gene_Info_Comments GLEAN3_06612 ###
duplicated exons of Sp-FRAP, see glean3_06053 for gene model
###Gene_Info_Comments GLEAN3_01692 ###
duplication, see glean3_06053 for  gene model
###Gene_Info_Comments GLEAN3_09520 ###
Annotated using P.lividus(AM179826 and CAJ47350) and S.purpuratus cDNA and protein sequences.
Glean 09520 contains the first exons. The C terminal exons have to be taken from Glean 03704 (Meredith Ashby). Protein sequences from the last exon of 09520 and the first exon of 03704 are not encoded in the cDNAs, and conversely a sequence present in both cDNAs is not present in the models. This may reflect alternative splicing or erroneous models.
Warning: possible assembly problem. The fragment of the protein sequence missing is coded for by a short sequence from scaffold 2003. This sequence is incorporated in other Glean and NCBI models: within an intron in one case, and as an exon read in a different frame in the other.
Reconstructed protein sequence is given.
Indicated highest blast hit is for non sea urchin sequences.     
###Gene_Info_Comments GLEAN3_06406 ###
The protein encoded by this gene has a thioester site but no cleavage site to separate alpha and beta chains as in Sp-C3.  The protein sequence is too short to be either a complement protein or alpha 2 macroglobulin.

GLEAN3_06406 overlaps with GLEAN3_26313.  An alignment showing this overlap is attached to this annotation.
###Gene_Info_Comments GLEAN3_19422 ###
The protein encoded by this gene has a thioester site, but no beta chain and no histidine to regulate thioester attack on the target sequence.  The sequence is too short to be complement or alpha 2 macroglobulin.  The sequence does not overlap with any other GLEAN sequences.
###Gene_Info_Comments GLEAN3_19612 ###
Similar to gi|72041679|ref|XM_793147.1|PREDICTED: Strongylocentrotus purpuratus similar to ataxin 7-like 2 (LOC593678), partial mRNA

###Gene_Info_Comments Sp-Tlr170 ###
Partial Toll-like receptor predicted by FgeneshAB and ++. The nucleotides have 89% identity to a typical Sp-Tlr (GLEAN3_23035). This gene model is located at the end of a contig, making the gene model incomplete.

###Gene_Info_Comments GLEAN3_25786 ###
high expression after UV-B irradiation of embryos
###Gene_Info_Comments Sp-Tlr215 ###
Partial Toll-like receptor predicted by FgeneshAB and Genscan. The nucleotides have 98% identity to a typical Sp-Tlr (GLEAN3_13751). There are 5 LRRs and no more LRRs were found in the 5' upstream region. This gene model is located at the end of a short scaffold.

###Gene_Info_Comments GLEAN3_13654 ###
The protein encoded by this gene has an N-terminal alpha 2 macroglobulin domain.  However, it does not have a thioester site.  There is an alpha/beta chain cleavage site and perhaps another for cleavage between alpha and gamma chains, but the site is in the wrong place.  The sequence is too short to be either a complement protein or alpha 2 macroglobulin.
###Gene_Info_Comments GLEAN3_05856 ###
Exon 2 was missing in the original Glean3 model.  mRNA sequence was obtained from Bill Marzluff.
###Gene_Info_Comments GLEAN3_27840 ###
The protein encoded by this gene matches to an EST with similarities to FKBP-12.
###Gene_Info_Comments GLEAN3_16548 ###
Significant partial overlap with GLEAN3_16547
###Gene_Info_Comments GLEAN3_18378 ###
   e val for NP_999777 = 0.0; KRP85 [Strongylocentrotus purpuratus].  
Updated sequence by replacing peptide seq with NP_999777 and nucleotide sequence with "NM_214612, 2213 bp, mRNA, linear, INV 12-AUG-2005"
   e val = e-135 for C_1880008 (http://genome.jgi-psf.org/Chlre3/Chlre3.home.html), Cr-Fla10 which is same as NCBI P46869
   e val = e-140 for NP_004789, KIF3B [Homo sapiens].
   Annotation by RL Morris, B Rossetti, and A Rawson.

###Gene_Info_Comments GLEAN3_26281 ###
Segment of KRP95, annotated fully in GLEAN3_26280.
Annotation by RL Morris, R.A.Obar, AP Rawson, and B Rossetti.
###Gene_Info_Comments GLEAN3_27012 ###
This glean model encodes the N-terminal part of the protein.  The C-terminus of the protein is contained in Glean3_14185.
###Gene_Info_Comments GLEAN3_14185 ###
Encodes the C-terminus of Wee1; see annotation to Glean3_27012.
###Gene_Info_Comments GLEAN3_11239 ###
e val to Q9P2H3 [Homo sapiens] = 0.0
e val to C_120075 (FAP167, IFT80, Intraflagellar Transport protein 80, http://genome.jgi-psf.org/Chlre3/Chlre3.home.html) = 3e-57
WD repeats.
Annotated by RL Morris.
###Gene_Info_Comments GLEAN3_10564 ###
three and a bit complete LRRNT-LRRn-LRRCT repeats

Together with adjacent gene (GLEAN3_10564) comprises a complete Slit gene - could be one or two exons encoding LRR repeats missing at junction
###Gene_Info_Comments GLEAN3_21581 ###
The start methionine of the ORF described here aligns with the query sequence (Mouse dystrophin) after the first 300 N-terminal amino acids.  GLEAN3 prediction 21580 lies immediately upstream on the same scaffold and encodes a partial spectrin motif.  It is possible that this should be included; however, this entry constitutes a well-aligned ORF against known homologs.
###Gene_Info_Comments GLEAN3_21228 ###
This model was annotated based on a manual inspection of multiple protein sequence and domain structure comparisons.

This and a very similar adjacent model (GLEAN3_21229) predict proteins with a domain structure very similar to that of coagulation factors 5 and 8 (long N-terminus of little complexity and a C-terminal Pfam F5_F8_type_C domain). Their C-terminal F5_F8_type_C blasts best to coagulation factor 8. It should be noted, however, that there is a predicted EGF domain at the N-terminus of this model, which is absent from coagulation factors 5 and 8.

Its adjacent model (GLEAN3_21229) is similar in sequence but far from identical, which suggests that these models might represent a true gene duplication event.
###Gene_Info_Comments GLEAN3_21229 ###
This model was annotated based on a manual inspection of multiple protein sequence and domain structure comparisons.

This and a very similar adjacent model (GLEAN3_21228) predict proteins with a domain structure very similar to that of coagulation factors 5 and 8 (long N-terminus of little complexity and a C-terminal Pfam F5_F8_type_C domain). Their C-terminal F5_F8_type_C blasts best to coagulation factor 8. It should be noted, however, that there is a predicted Cadherin domain at the N-terminus of this model, which is absent from coagulation factors 5 and 8.

Its adjacent model (GLEAN3_21228) is similar in sequence but far from identical, which suggests that these models might represent a true gene duplication event.
###Gene_Info_Comments GLEAN3_12598 ###
e val to XP_787973 = e-122.
e val to Q9P2H3 [Homo sapiens] = e-73
Some similarity to FAP167 (IFT80, Intraflagellar Transport protein 80, http://genome.jgi-psf.org/Chlre3/Chlre3.home.html).
Annotated by RL Morris.
###Gene_Info_Comments GLEAN3_26298 ###
one complete LRR unit LRR-NT/17LRR/LRR-CT

adjacent gene (GLEAN3_26299) has very similar structure - probably part of the same LRR protein
###Gene_Info_Comments GLEAN3_26299 ###
one complete LRR unit LRR-NT/11LRR/LRR-CT

adjacent gene (GLEAN3_26298) has very similar structure - probably part of the same LRR protein
###Gene_Info_Comments GLEAN3_00831 ###
Similar to R-PTP-delta.  Partial sequence. May be a portion of a duplicate gene.  Another Sp-R-PTP-delta, GLEAN3_13607, is not on the same scaffold.  GLEAN3_13607 is probably a duplicate.
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137535563-11783-53029041881.BLASTQ1
###Gene_Info_Comments GLEAN3_00164 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_00294 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_00743 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_01182 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_01274 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_02144 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_02697 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_03227 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_03618 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_03774 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_04015 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_05111 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_05594 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_05989 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_05991 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_06508 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_07040 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_07576 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_07766 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_08393 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_09504 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_10100 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_10101 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_10212 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_10297 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_11163 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_12314 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_12942 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_13649 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_14081 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_14082 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_14184 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_14222 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_15079 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_15211 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_17295 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_17810 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_17860 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_19213 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_19575 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_20517 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_22152 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_22396 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_22861 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_23797 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_24127 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_24382 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_25181 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_25184 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_25248 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_25874 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_25875 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_25892 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_26524 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_27079 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_28229 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_28432 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_28538 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments GLEAN3_28564 ###
One of 59 models with only one clectin motif and no others
###Gene_Info_Comments Sp-Tfpi-like ###
This model was created based on a Fgenesh++ prediction on scaffold98422 and a manual inspection and comparison of the predicted protein to similar genes in other groups.

Based on our analysis thus far, this model likely corresponds to a partial prediction, as indicated by the similarity of the available sequence from this model to Tissue factor pathway inhibitor genes in vertebrates.  
###Gene_Info_Comments GLEAN3_00786 ###
It could be a partial sequence!
###Gene_Info_Comments GLEAN3_15878 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and domain structure analyses.

The structure of this model is supported by the fact that other gene prediction protocols generated identical models.

The domain structure of this model differs slightly from that of vertebrate kallikrein B1 genes, in that no Pfam PAN domain is predicted in its N-terminus. Otherwise, the size and structure of this prediction are similar to those of kallikrein B1.
###Gene_Info_Comments GLEAN3_06723 ###
See also GLEAN3_05592 and GLEAN3_24688.
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137537488-29699-97084085825.BLASTQ4
###Gene_Info_Comments GLEAN3_08466 ###
Blasts to PTPRT, but doesn't clade with the PTPR K/M/T/U group in phylogenetic analysis using PTPc domains 1 or 2.  Renamed PTPRorph2.
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137537867-27665-146362170620.BLASTQ4
###Gene_Info_Comments GLEAN3_08878 ###
Partial sequence.  Similar to Survivin 2.
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137538166-23506-35039912798.BLASTQ4
###Gene_Info_Comments GLEAN3_00276 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 3. The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: RERERERERWRDGRGEREIATERERERERERERERERKRNMERERERERERERERIGNKSEYGIVRYVXXXXXXXXXXXXXXXXXGGGRDSIPFIENP
###Gene_Info_Comments GLEAN3_14528 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 2.
###Gene_Info_Comments GLEAN3_12586 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 2,5,6,7.
###Gene_Info_Comments GLEAN3_13962 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 2.

The C-terminus has likely been wrongly attached to GLEAN3_08353.
###Gene_Info_Comments GLEAN3_04028 ###
Matches_GLEAN3_04028. Transcriptome data indicates that Glean may have falsely predicted the following exons: 1.
###Gene_Info_Comments GLEAN3_02815 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: KMEYILRERQLRLGRRLQCHHIHPQMSKATKSNQLQWHHIPMGTHNRLVLTAKLSSRRNGNFSVTIRGILFVTIILLLVIVVVTFSVGFVMVFLTDKWPRRYQSYPVIGDDWIKSGFRKRTESRINLPTYTYYGY,FVPSEHLLNARLSAVEKSVGKALISGVSSIHGQNQPLTDQAKPQDLNQEDNQTTAQPTQPTQQDEGDDSGITHQPLNVTTDSIEDGVHTEGTTTQVGQETAMPPHTSANVKGDQKQPTTMAPHTNGDSQPPSADGEVVIKAKREFLGDHPRYSFRDNNPFVGDRRGDVLGRLRDGLPHRQVASQVPVLPRHRRRLDQKWFQKKDGIEN
###Gene_Info_Comments GLEAN3_22816 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: HPSSSSPLYTPNFSSPSSSSLPSVYLFPKSFIQPLLYLQHRPLLLNNIVLLLLSFLTPSTTSSSSSSPSLISFFFSSSIFSKIVFPSSSSPPQPPPKLPPPPPPPLLPNPFHNHILQYFLPPPLPCLLCIPFPQSFIQPLLYLHLKLRLFHSSSSPLFSYLL,FCSSSSSSSSCSSSFSSSTTSPPPPLLPNPLHNFLLSLFLLDILLLLLLYILQIFLPPPPPPCLQYISFLRVSSNRSCTCNIAPSS,EFHPTAPVPATSPPPPKQHRPPPPLLPNPLHNFVLLFIPLLDILLLLLLHILQNCLPLLLLSSPTPSKTSSSSSSSSPSQPLPQPYSPIFSSPSSPLPSVYPLSSEFHPTAPVPALKAPPFPLLLLSPFLLSP
###Gene_Info_Comments GLEAN3_23386 ###
The tiling data indicates that in the vicinity of the gene are significant ORFs which might represent additional exons. ORFs: AKMLAFITSHRTKARGKKEYDIVFPALFSWLYRVCFLAKRFIYAKQPQAMKPGGIGCLNEEGRKVLYKFLSFSFSPSTPILHAQINSRH,MFFVSPTNSTPNFSLATIRPRVVSAIALAAGLHPSNSIQHKSVKPSLGECRFEAEASSLTSFDLCLFALRFPLSLHSDIA,IRAYITVYNINIYVRHLEYDSSTNTLCVYVLSPSLYYSPSPSYLPLPISLSPSPFPKYFYVAYNALVISYRTLVDIFLGFCLNVFCFAN,LIHHPDMPAEIFTTPTSNTDTLLVQPRHFVYTIISIRPHAYKLMNPPPPSTSLFLSPSPSSTPFYYFFEVFPLAFRALVR,ERPNICPLASPDLSVLTKEAPVRINRTLLFLSLSPDPGPDNIQGISKSHSLFSLYFLSFFFQSPIPPPPPSHHPPQISSLP,SFTSIKYICLKDCSTISAHYPKHAHYQADTLLWFSLHSRMRRRWAIYCTRKIITRIPNDRAATLPDIHLFSLPASLPPTRYSPVIVH,IYLFKRLFYNICSLSQARTLSGRYPTVVLTPFTHEEKMGNLLHEENHNENSKRQGSNLARHTPVLLTCLIASYQVFTCNRPLRRDNSLPLSTVTDALFDAYAEVY
###Gene_Info_Comments GLEAN3_07882 ###
Matches c-type lectin domain (cd00037).
###Gene_Info_Comments GLEAN3_22405 ###
Ig-EGF-FN3 - a domain sequence unique to Tie1/2 in chordates
Lacks TM and kinase domains - suspect this is missing 3' end of gene.
###Gene_Info_Comments GLEAN3_10943 ###
The GLEAN model directly corresponds to the previously cloned sea urchin (purpuratus) fascin 1 gene.
###Gene_Info_Comments GLEAN3_23200 ###
variant b, 2 DSRM domains, the other gene has just one. 
###Gene_Info_Comments GLEAN3_10068 ###
one DSRM domain
###Gene_Info_Comments GLEAN3_17592 ###
partial CDS containing an exon encoding part of the protease domain.  Nucleic acid sequence is very close to that of GLEAN3_11551.
###Gene_Info_Comments GLEAN3_18708 ###
partial CDS on a short scaffold.  Nucleic acid sequence is identical to that of GLEAN3_28742.
###Gene_Info_Comments GLEAN3_19655 ###
partial CDS; predicted exons supported by weak signals on the transcriptome.
###Gene_Info_Comments GLEAN3_06214 ###
PREDICTED: Strongylocentrotus purpuratus similar to CG11793-PA (LOC579361), mRNA Length=893

###Gene_Info_Comments GLEAN3_07151 ###
PREDICTED: Strongylocentrotus purpuratus similar to CG1548-PA (LOC575021), mRNA

###Gene_Info_Comments GLEAN3_15385 ###
PREDICTED: Strongylocentrotus purpuratus similar to lipocalin 7 (LOC576823), mRNA

###Gene_Info_Comments GLEAN3_21839 ###
GLEAN3_21839 encodes a partial 3'-terminal sequence of the Cdk2 mRNA
The sequence is entirely contained in GLEAN3_07655, Refer to it for further annotation
###Gene_Info_Comments GLEAN3_01005 ###
4 SRCR domains. Hits DMBT1 (probably superficial).  Probably incomplete model.  Possibly part of GLEAN3_01004.
###Gene_Info_Comments GLEAN3_01004 ###
TM-SRCR(8).  Probably partial model.  Possibly part of gene that includes GLEAN3_01005.
###Gene_Info_Comments GLEAN3_00984 ###
SRCR(5).  Probably incomplete.
###Gene_Info_Comments GLEAN3_00672 ###
Two gene models seem to be fused together. The first 7 exons appear to belong to the NLR gene, although the first exon is also questionable.
Domains: DEATH-NACHT-LRRs

###Gene_Info_Comments GLEAN3_13206 ###
The Genscan model contains additional sequences at the 5' end and seems more accurate. The gene features and sequences were annotated according to this model.
Domains: DEATH-NACHT-PYD-LRRs 
###Gene_Info_Comments GLEAN3_24528 ###
novel architecture - intermingled SR and FU repeats followed by a series of Ig domains and a TM
looks like a more complete version of the predicted gene UPI0000583F83 - similar to deleted in malignant brain tumors 1 isoform c precursor

compare GLEAN3_06068 - a duplication of the C-terminal half of this gene
###Gene_Info_Comments GLEAN3_06952 ###
has all the extracellular features of a toll receptor - no TIR in cyto domain - could be incomplete gene or simply an LRR receptor

Note - adjacent gene (GLEAN3_06951) is very similar
###Gene_Info_Comments GLEAN3_06951 ###
has all the extracellular features of a toll receptor - no TIR in cyto domain - could be incomplete gene or simply an LRR receptor

Note - adjacent gene (GLEAN3_06952) is very similar
###Gene_Info_Comments GLEAN3_21178 ###
has all the extracellular features of a toll receptor - no TIR in cyto domain - could be incomplete gene or simply an LRR receptor
###Gene_Info_Comments GLEAN3_01106 ###
has all the extracellular features of a toll receptor - no TIR in cyto domain - could be incomplete gene or simply an LRR receptor
###Gene_Info_Comments GLEAN3_21751 ###
has all the extracellular features of a toll receptor - no TIR in cyto domain - could be incomplete gene or simply an LRR receptor
###Gene_Info_Comments GLEAN3_14039 ###
has the extracellular features of a toll receptor (a bit spaced out) - no TIR in cyto domain - could be incomplete gene or simply an LRR receptor
###Gene_Info_Comments GLEAN3_14845 ###
has the extracellular features of a toll receptor (a bit spaced out) - no TIR in cyto domain - could be incomplete gene or simply an LRR receptor
###Gene_Info_Comments GLEAN3_07065 ###
has the extracellular features of a toll receptor BUT no predicted TM and no TIR in putative cyto domain - could be incomplete gene or simply an LRR receptor
###Gene_Info_Comments GLEAN3_22463 ###
has the extracellular features of a toll receptor (a bit spaced out) BUT no predicted TIR in putative cyto domain - could be incomplete gene or simply an LRR receptor
###Gene_Info_Comments GLEAN3_18237 ###
has the extracellular features of a toll receptor (maybe fewer LRRs - could be missing N-terminus) BUT no predicted TIR in putative cyto domain - could be incomplete gene or simply an LRR receptor
###Gene_Info_Comments GLEAN3_01436 ###
has the extracellular features of a toll receptor (maybe fewer LRR repeats) BUT no predicted TIR in putative cyto domain - could be incomplete gene or simply an LRR receptor
###Gene_Info_Comments GLEAN3_18238 ###
has the extracellular features of a toll receptor (maybe fewer LRR repeats) BUT no predicted TIR in putative cyto domain - could be incomplete gene or simply an LRR receptor
###Gene_Info_Comments GLEAN3_20527 ###
has the extracellular features of a toll receptor BUT no predicted TM or TIR cyto domain - could be incomplete gene or simply an LRR receptor
###Gene_Info_Comments GLEAN3_19916 ###
has the extracellular features of a toll receptor (maybe fewer LRR repeats) BUT no predicted TM or TIR cyto domain - could be incomplete gene or simply an LRR receptor
###Gene_Info_Comments GLEAN3_26114 ###
has the extracellular features of a toll receptor (maybe a bit spaced out) BUT no predicted TM or TIR cyto domain - could be incomplete gene or simply an LRR receptor
###Gene_Info_Comments GLEAN3_08477 ###
has the extracellular features of a toll receptor (maybe a few more than usual LRRs) BUT no predicted TM or TIR cyto domain - could be incomplete gene or simply an LRR receptor
###Gene_Info_Comments GLEAN3_00876 ###
SRCR(2).  Probably partial. 
###Gene_Info_Comments GLEAN3_07476 ###
looks rather like an LRR/Ig membrane receptor - a set of LRRs with a CT domain followed by Ig-like and TM - lacks an LRR-NT domain - could be incomplete

BEST MATCH IS LRIG receptors 
###Gene_Info_Comments GLEAN3_24251 ###
looks like a good model of an LRR/Ig membrane receptor - has complete set of LRR with NT and CT domains followed by Ig-like and TM - there are quite a few receptors of this type in humans

BEST MATCH IS LRIG receptors - some homology with Gp-V of platelets
###Gene_Info_Comments GLEAN3_01749 ###
has the extracellular features of a toll receptor (plus an LRR-NT domain at N-terminus) BUT no predicted TM or TIR cyto domain - could be incomplete gene or simply an LRR receptor
###Gene_Info_Comments GLEAN3_26296 ###
has the extracellular features of a toll receptor (maybe a few more than usual LRRs and has an N-terminal LRR-NT domain) BUT no predicted TIR cyto domain - could be incomplete gene or simply an LRR receptor
###Gene_Info_Comments GLEAN3_00740 ###
Signal Peptide-SRCR(4)-TM.
###Gene_Info_Comments GLEAN3_00654 ###
Signal Peptide-SRCR(2).  Possibly partial
###Gene_Info_Comments GLEAN3_02925 ###
has partial LRR unit - LRR4/LRRCT and TM - could be fragment of a toll receptor or of another type of LRR receptor
###Gene_Info_Comments GLEAN3_00646 ###
Sig Pep - SRCR(5).  Possibly partial.
###Gene_Info_Comments GLEAN3_07463 ###
one complete LRR unit LRR-NT/21LRR/LRR-CT
###Gene_Info_Comments GLEAN3_01172 ###
SRCR(3)-TM.  Probably partial. (DMBT1)
###Gene_Info_Comments GLEAN3_01177 ###
SRCR(2).  Probably partial. (DMBT1)
###Gene_Info_Comments GLEAN3_01229 ###
SRCR(5). Probably partial. (DMBT1)
###Gene_Info_Comments GLEAN3_01266 ###
SRCR(5)-TM. Probably partial.  (DMBT1)
###Gene_Info_Comments GLEAN3_01601 ###
F5_F8_type_C(1)-SRCR(3). Probably partial.
###Gene_Info_Comments GLEAN3_08669 ###
Putative conserved TAFII28 domain detected in BlastP search.

Best Genbank hit was a predicted protein similar to TFIID subunit 11 in S. purpuratus (XP_789830; 477 bits, 3e-133).

Best Genbank empirical support for TFIID subunit 11 is TAF11 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 28kDa [Homo sapiens], accession AAV38212, score 170 bits, 1e-40.

All exons are supported by tiling path array data.
Exonerate and Splign both support expression of this protein.
However, query of Poustka's database does not retrieve significant hits.

Modified glean model by accepting Davidson's 3' UTR. No other changes were made.
###Gene_Info_Comments GLEAN3_08435 ###
fragment of an Igc2/FN3 protein - 3 Ig and 1 FN3 - no TM
###Gene_Info_Comments GLEAN3_01727 ###
SRCR(9). Probably partial. (DMBT1)
###Gene_Info_Comments GLEAN3_01763 ###
SigPep-SRCR(2). Probably partial. (Hensin/DMBT1)
###Gene_Info_Comments GLEAN3_07715 ###
EGF-Ca x3 - Igc2 x2 - FN3 - probably a fragment - this domain organisation is not particularly informative as to identity

looks a little like a fragment of Tie 1/2 but the exact domains are not quite right
###Gene_Info_Comments GLEAN3_07629 ###
Ig-EGF-FN3 - probably a fragment - this domain organisation is not particularly informative as to identity
###Gene_Info_Comments GLEAN3_01863 ###
SRCR(2). Probably partial. (DMBT1)
###Gene_Info_Comments GLEAN3_14732 ###
Blasted protein sequence of human gene NM_005645 against Baylor to obtain Glean3_14732.

Blasted glean gene against NCBI. Putative conserved domains detected: TFIID-18kDa (Pfam). Best Genbank hit: XP_796890. Best empirical data support (provisional acceptance at NCBI): TAF13 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 18kDa [Accession: NP_001016066, cDNA, Xenopus tropicalis; bits 145, 4e-34] and an unknown protein sequence [Accession: AAH74456, cDNA, Xenopus laevis; score, bits 145, 5e-34].

Exonerate and Splign data exist that support all four exons of the gene. Poustka's database lacks support for any exon. Tiling path array data support exons 2-4, but are inconclusive for the first exon.

I omitted Davidson's 3' UTR because it was HUGE: longer than all the exons of the CDS combined. If the 3'UTR modification were accepted, the resultant gene would contain both Glean3_14732 (this gene) and Glean3_14733 (the next gene on the scaffold in the 3' direction).




###Gene_Info_Comments GLEAN3_02028 ###
SRCR(2). Probably partial. (DMBT1)
###Gene_Info_Comments GLEAN3_02041 ###
SigPep-SRCR(3)-TM. (DMBT1)
###Gene_Info_Comments GLEAN3_02350 ###
SRCR(13). Probably partial. (DMBT1)
###Gene_Info_Comments GLEAN3_24857 ###
six Igc2 repeats - probably a fragment of some adhesion protein or receptor
###Gene_Info_Comments GLEAN3_08622 ###
eight Ig and Igv repeats - probably a fragment of some adhesion protein or receptor

C-terminal DEATH domain could be an artefact - no known proteins with this structure
###Gene_Info_Comments GLEAN3_03127 ###
SRCR(4)-TM. Probably partial. (DMBT1)
###Gene_Info_Comments GLEAN3_13889 ###
seven Ig-like repeats and a heme peroxidase domain - homolog of peroxidasin, although missing LRR repeats at N-terminus
###Gene_Info_Comments GLEAN3_03384 ###
SRCR(4). Probably incomplete. 
###Gene_Info_Comments GLEAN3_12354 ###
nine Igc2 and Ig repeats - probably a fragment of some adhesion protein or receptor
###Gene_Info_Comments GLEAN3_03526 ###
SRCR(6). Probably partial. (hensin/dmbt1)
###Gene_Info_Comments GLEAN3_20291 ###
seven Ig and Igc2 repeats - probably a fragment of some adhesion protein or receptor
###Gene_Info_Comments GLEAN3_06024 ###
fifteen Ig, Igc2 and Igv repeats - probably a fragment of some adhesion protein or receptor
###Gene_Info_Comments GLEAN3_03778 ###
SigPep-SRCR(6). Probably incomplete. (DMBT1)
###Gene_Info_Comments GLEAN3_03930 ###
SRCR(10). Probably partial. (DMBT1)
###Gene_Info_Comments GLEAN3_12352 ###
nine Ig and Igc2 repeats - probably a fragment of some adhesion protein or receptor
###Gene_Info_Comments GLEAN3_03963 ###
SRCR(6)-TM. Probably partial. (DMBT1). Maybe continuous with GLEAN3_03964, GLEAN3_03965, GLEAN3_03966
###Gene_Info_Comments GLEAN3_22647 ###
eight Ig and Igc2 repeats - probably a fragment of some adhesion protein or receptor
###Gene_Info_Comments GLEAN3_27956 ###
six Ig and Ig-like repeats - probably a fragment of some adhesion protein or receptor
###Gene_Info_Comments GLEAN3_25432 ###
fourteen Ig and Igc2 repeats - probably a fragment of some adhesion protein or receptor
###Gene_Info_Comments GLEAN3_03964 ###
SRCR(3)-TM. Probably partial. (DMBT1). Maybe continuous with GLEAN3_03963, GLEAN3_03965, GLEAN3_03966
###Gene_Info_Comments GLEAN3_10123 ###
eight Ig and Igc2 repeats - probably a fragment of some adhesion protein or receptor
###Gene_Info_Comments GLEAN3_03965 ###
SRCR(2). Probably partial. (DMBT1). Maybe continuous with GLEAN3_03963, GLEAN3_03964, GLEAN3_03966
###Gene_Info_Comments GLEAN3_15581 ###
eight Ig and Igc2 repeats - probably a fragment of some adhesion protein or receptor
###Gene_Info_Comments GLEAN3_03966 ###
SRCR(3)-TM. Probably partial. (DMBT1). Maybe continuous with GLEAN3_03963, GLEAN3_03964, GLEAN3_03965
###Gene_Info_Comments GLEAN3_22021 ###
six Igc2 repeats - probably a fragment of some adhesion protein or receptor

long low-complexity sequence preceding Ig domains is suspicious 
###Gene_Info_Comments GLEAN3_00532 ###
eight Ig and Igc2 repeats - probably a fragment of some adhesion protein or receptor
###Gene_Info_Comments GLEAN3_12350 ###
ten Ig and Igc2 repeats - probably a fragment of some adhesion protein or receptor
###Gene_Info_Comments GLEAN3_06022 ###
eleven Ig and Igc2 repeats - probably a fragment of some adhesion protein or receptor
###Gene_Info_Comments GLEAN3_12355 ###
six Ig and Igc2 repeats - probably a fragment of some adhesion protein or receptor
###Gene_Info_Comments GLEAN3_00453 ###
ten Igc2 repeats - probably a fragment of some adhesion protein or receptor
###Gene_Info_Comments GLEAN3_13978 ###
six Ig and Igc2 repeats - probably a fragment of some adhesion protein or receptor
###Gene_Info_Comments GLEAN3_21326 ###
six Ig and Igc2 repeats - probably a fragment of some adhesion protein or receptor
###Gene_Info_Comments GLEAN3_21747 ###
fourteen Ig and Igc2 repeats - probably a fragment of some adhesion protein or receptor
###Gene_Info_Comments GLEAN3_00533 ###
eight Ig and Igc2 repeats - probably a fragment of some adhesion protein or receptor
###Gene_Info_Comments GLEAN3_21904 ###
seven Ig and Igc2 repeats - probably a fragment of some adhesion protein or receptor
###Gene_Info_Comments GLEAN3_08653 ###
seven Ig and Igc2 repeats - probably a fragment of some adhesion protein or receptor

long low-complexity sequence surrounding Ig domains is suspicious 
###Gene_Info_Comments GLEAN3_27927 ###
Exon 2 of the Glean model does not Blast to NLR proteins. May be erroneous. There are unresolved (NNN) sequences within this model as well.
Domains: NACHT, DEATH, LRR

###Gene_Info_Comments GLEAN3_03942 ###
Homo sapiens mRNA for polyglutamine binding protein variant 14(PQBP1 gene).
###Gene_Info_Comments GLEAN3_11039 ###
Strongylocentrotus purpuratus calcium-binding protein (Endo16), mRNA.

###Gene_Info_Comments GLEAN3_27335 ###
PREDICTED: Strongylocentrotus purpuratus similar to Galanin receptor type 2 (GAL2-R) (GALR2) (LOC580125), mRNA
Length=1029

###Gene_Info_Comments GLEAN3_11732 ###
Strongylocentrotus purpuratus similar to gamma-aminobutyric acid A receptor, epsilon (LOC585340), mRNA
###Gene_Info_Comments GLEAN3_08212 ###
PREDICTED: Strongylocentrotus purpuratus similar to Somatostatin receptor type 2 (SS2R) (SRIF-1) (LOC581556), mRNA
###Gene_Info_Comments GLEAN3_16241 ###
PREDICTED: Strongylocentrotus purpuratus similar to Glycogenin-1 (LOC589298), partial mRNA

###Gene_Info_Comments GLEAN3_27885 ###
PREDICTED: Strongylocentrotus purpuratus similar to ceruloplasmin (LOC586705), mRNA

###Gene_Info_Comments GLEAN3_20534 ###
PREDICTED: Strongylocentrotus purpuratus similar to Surfeit locus protein 1 (LOC592318), mRNA

###Gene_Info_Comments GLEAN3_19373 ###
PREDICTED: Strongylocentrotus purpuratus similar to surfeit 5 isoform b (LOC583077), mRNA.

###Gene_Info_Comments GLEAN3_18029 ###
TSPN and TSP1 domain combination. This domain combination is usually found in subgroup A thrombospondins. One fly protein has just these 2 domains.
The TSPN domain is most closely related to the one found in human collagen11a1.
The gene prediction probably includes some repetitive sequence elements.
###Gene_Info_Comments GLEAN3_24181 ###
Likely C-terminal truncation due to end of contig
###Gene_Info_Comments GLEAN3_25559 ###
Model contains only a TSPN domain but blast suggests that this domain is related to the Col15/18 family. The gene model is probably partial.
###Gene_Info_Comments GLEAN3_04595 ###
Partial glean model for Nek8, duplicate of a longer glean model GLEAN3_05411
###Gene_Info_Comments GLEAN3_28861 ###
Two overlapping glean predictions match DAP5 on scaffold 22799 (glean3_23932) and scaffold 105341 (glean3_28861). 
New gene model proposed:
Exon 1 Scaffold105341|5215|5219|+
Exon 2 Scaffold105341|6813|6935|+
Exon 3 Scaffold105341|8718|8820|+ 
Exon 4 Scaffold22799|7457|7567|+
Exon 5 Scaffold22799|7779|7805|+
Exon 6 Scaffold22799|8433|8499|+
Exon 7 Scaffold22799|8977|9069|+
Exon 8 Scaffold22799|9731|9789|+
Exon 9 Scaffold22799|11308|11490|+
Exon 10 Scaffold22799|11835|11961|+
Exon 11 Scaffold22799|13490|13579|+
Exon 12 Scaffold22799|15302|15734|+
Exon 13 Scaffold22799|16419|16569|+
Exon 14 Scaffold22799|18222|18533|+
Exon 15 Scaffold22799|19645|19860|+
Exon 16 Scaffold22799|20736|21041|+
Exon 17 Scaffold22799|21914|22071|+
Exon 18 Scaffold22799|23074|23282|+
Exon 19 Scaffold22799|23856|23977|+
Exon 20 Scaffold22799|24686|24757|+
Exons 4 to 8 are present in both scaffolds.  Homologous sequences to exon 5 are found twice nearby (e.g. on scaffold22799, 7657-7683 and 8050-8076); multiple isoforms?
Exon 15 is also duplicated in GLEAN3_04171.
###Gene_Info_Comments GLEAN3_11191 ###
The encoded aa sequence is entirely contained in GLEAN3_03528.
In GLEAN3_11191 the sequence encoded by exons 3 and 4 of GLEAN3_03528 is missing.

###Gene_Info_Comments GLEAN3_23932 ###
Two overlapping glean predictions match DAP5 on scaffold 22799 (glean3_23932) and scaffold 105341 (glean3_28861). 
see glean3_28861 for gene model
###Gene_Info_Comments GLEAN3_15935 ###
There is only one gene representing SNRPA and U2B" in Urchin.
###Gene_Info_Comments GLEAN3_05411 ###
Partial gene model. The C-terminal part of the protein is on GLEAN3_04595 (duplicated exons identical). The N-terminal part of the protein is missing (not found in this genome assembly), so the kinase domain is incomplete. The beginning of Glean3_05411 is located at the border of the scaffold and does not reflect the real N-terminus.
###Gene_Info_Comments GLEAN3_28068 ###
PREDICTED: Strongylocentrotus purpuratus similar to large conductance calcium-activated potassium channel subfamily M alpha member 1 isoform b (LOC578468), mRNA

###Gene_Info_Comments GLEAN3_10354 ###
GLEAN3_10354 lies on minus strand of Scaffold90906. Protein is probably missing N-ter (comparative analysis of transmembrane helices in the prot family). 
Analysis of genomic region 5' of glean prediction (used Genscan and GeneMark to look for additional exons): not conclusive.
Scaffold90906 seq is incomplete 5' to GLEAN3_10354; additional exons may lie there. Alternatively, N-ter may possibly lie on GLEAN3_06608 (Scaffold109591).


###Gene_Info_Comments GLEAN3_22934 ###
This model was annotated based on multiple protein sequence alignments and a manual analysis of predicted domain structures.

The annotation of this model is supported by reciprocal blasting and the observation that the domain structure of this model is similar to that of plasminogen genes. It should be noted, however, that given the position of this model in the current assembly (in a scaffold containing various sequence gaps), this model may be only partial and could be significantly improved as updated versions of the assembly become available.
###Gene_Info_Comments GLEAN3_00633 ###
This model was annotated based on multiple protein sequence alignments and a manual analysis of predicted domain structures.

The annotation of this model is supported by reciprocal blasting and the observation that the domain structure of this model is similar to that of plasminogen genes. It should be noted, however, that the protein predicted by this model resembles a partial plasminogen. The position of this model in the current assembly is inconclusive with regards to the possibility that this model may be incomplete or that it may represent a novel protein related in structure to plasminogen.
###Gene_Info_Comments GLEAN3_04292 ###
This gene is split between two scaffolds.  This model encodes the N-terminal portion of the protein.  The C-terminal portion, with some overlap, is found in Glean3_25615.
###Gene_Info_Comments GLEAN3_25615 ###
This model encodes the C-terminal portion of the Rbl-1 protein.  The N-terminal portion (with some overlap) is found in Glean3_04292.
###Gene_Info_Comments GLEAN3_04011 ###
SRCR(3). Probably partial. (DMBT1)
###Gene_Info_Comments GLEAN3_14731 ###
This model may be duplicated in GLEAN3_14730. Please refer to GLEAN3_14730 for details and comments.
###Gene_Info_Comments GLEAN3_04086 ###
SRCR(4). Probably partial. (DMBT1)
###Gene_Info_Comments GLEAN3_04100 ###
SigPep-SRCR(2)-TM. Possibly part of gene that includes GLEAN3_101(and 102?). (Brain Ser prot/hensin/DMBT1)
###Gene_Info_Comments GLEAN3_06242 ###
Gene correctly predicted
###Gene_Info_Comments GLEAN3_04101 ###
SigPep-SRCR(7)-TM. Possibly part of gene that includes GLEAN3_100(and 102?). (DMBT1)
###Gene_Info_Comments GLEAN3_14730 ###
This model was annotated based on multiple protein sequence alignments and a manual analysis of predicted domain structures.

The annotation of this model is supported by reciprocal blasting and the observation that the domain structure of this model is similar to that of plasminogen genes. It should be noted, however, that given the position of this model in the current assembly (close to the end of a scaffold), this model may be only partial and could be significantly improved as updated versions of the assembly become available.

It should also be noted that this model may be duplicated in an adjacent model (GLEAN3_14731). They're 98% identical at the protein level and they are located on separate contigs, which might indicate a very recent true gene duplication event or an assembly problem (haplotypes?).
###Gene_Info_Comments GLEAN3_04160 ###
SRCR(8). Possibly partial. (DMBT1)
###Gene_Info_Comments GLEAN3_02088 ###
GLEAN3_13821 appears to be identical.
###Gene_Info_Comments GLEAN3_04642 ###
SRCR(2)-Sushi(1?).  Possibly partial. (DMBT1)
###Gene_Info_Comments GLEAN3_05000 ###
SigPep-SRCR(5).  Possibly partial.  (DMBT1)
###Gene_Info_Comments GLEAN3_05154 ###
SRCR(7). Possibly partial.
###Gene_Info_Comments GLEAN3_05420 ###
SRCR(5)-Sushi(1)-TM.  Possibly partial.
###Gene_Info_Comments GLEAN3_05464 ###
SRCR(6)-TM. Possibly partial.
###Gene_Info_Comments GLEAN3_05414 ###
This model was annotated based on multiple protein sequence alignments and a manual analysis of predicted domain structures.

The annotation of this model is supported by reciprocal blasting and the observation that the domain structure of this model is similar to that of plasminogen genes. It should be noted, however, that given the position of this model in the current assembly (close to an end of a small scaffold), this model may be only partial and could be significantly improved as updated versions of the assembly become available.
###Gene_Info_Comments GLEAN3_06608 ###
Partial sequence
GLEAN3_06608 (Scaffold109591) probably contains the N-terminus of Rh50; the C-terminus is to be found on GLEAN3_10354 (minus strand of Scaffold90906). Alternatively, these are two different genes.

Probably only the first two exons code for Rh50 protein
Third exon (pos. 18813-18900) probably not real: deleted


###Gene_Info_Comments GLEAN3_05556 ###
SRCR(9). Possibly partial.
###Gene_Info_Comments GLEAN3_27802 ###
PREDICTED: Strongylocentrotus purpuratus similar to Ubiquitin ligase protein RNF8 (RING finger protein 8) (LOC583203), mRNA

###Gene_Info_Comments GLEAN3_16506 ###
Model inaccurately predicts N-terminus (>100 amino acids have been omitted!)


GLEAN3_21385 appears to be identical.
###Gene_Info_Comments GLEAN3_05860 ###
SRCR(3)-TM-PTPc. Unique domain structure.  Possibly partial.
###Gene_Info_Comments GLEAN3_09188 ###
Pfam PF00909
###Gene_Info_Comments GLEAN3_18141 ###
Pfam PF00909
###Gene_Info_Comments GLEAN3_00856 ###
PREDICTED: Strongylocentrotus purpuratus similar to potassium voltage gated channel, Shab-related subfamily, member 2 (LOC593326), mRNA Length=3318

###Gene_Info_Comments GLEAN3_13823 ###
One EST (CD295368) appears to include the N-terminus of the protein and suggests the true start methionine is slightly upstream of the predicted start methionine in the GLEAN model. Based on this EST the model has been modified to include these additional amino acids at the N-terminus (MFCFRAILVLSACVVYGQKKEKTNVFTIKPVSSIYLPHYVAGKKSWGINKDAAVKTAYD...) Note that this added sequence contains a predicted signal sequence (a feature of other MSP130 proteins).

GLEAN3_06387 appears to be identical over much of its sequence but contains additional proline/glutamine-rich repeats.
###Gene_Info_Comments GLEAN3_01726 ###
Partial sequence
chimera: MelB N-ter + Amt3 C-ter

MelB N-ter is given by:
>GLEAN3_01726|Scaffold620|154587|154719| DNA_SRC: Scaffold620 START: 154587 STOP: 154719 STRAND: - 
>GLEAN3_01726|Scaffold620|155996|156169| DNA_SRC: Scaffold620 START: 155996 STOP: 156169 STRAND: - 
>GLEAN3_01726|Scaffold620|157562|157827| DNA_SRC: Scaffold620 START: 157562 STOP: 157827 STRAND: - 
>GLEAN3_01726|Scaffold620|161305|162403| DNA_SRC: Scaffold620 START: 161305 STOP: 162403 STRAND: -
###Gene_Info_Comments GLEAN3_06386 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to "cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 30.71% over 368 BLAST alignment positions. 759 of 1189 Muscle alignment positions masked (63.800%; 430 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_11265 ###
Partial sequence: N-ter only

CDS at the end of scaffold

Pfam PF00909
###Gene_Info_Comments GLEAN3_26576 ###
The sequence coded by this GLEAN is entirely contained in GLEAN3_15285; refer to GLEAN3_15285 for further annotation.
In GLEAN3_26576, exon 1 and part of exon 2 are missing.
###Gene_Info_Comments GLEAN3_14492 ###
Very similar at both the amino acid and nucleotide levels to 3 other GLEAN models: 21242, 15326, and 12567. It seems likely that at least some of these are haplotypes or improperly assembled genes. 
###Gene_Info_Comments GLEAN3_05008 ###
Partial CDS based on alignment with the best BLAST hit sequence.  Transcriptome data strongly suggest that the following exons belong to this gene.  The first of the following predicted exons blasts to the same metalloprotease.
>Supertig106616_1|Scaffold106616|20708|20789| DNA_SRC: Scaffold106616 START: 20708 STOP: 20789 STRAND: + 
GGCTATTGAATTTACGTGGAGAAGATATTCCATACAACCCTCTCTTTATATCGTTTGCTATCGTCGGGAC
CCAGTACATCAA
>Supertig106616_1|Scaffold106616|21416|21513| DNA_SRC: Scaffold106616 START: 21416 STOP: 21513 STRAND: + 
CCTATACCTCTATGATGCTAGCCGCCGGTTGGACACGGAGAGATATTCGGACTTGAGAGAACATTTGGAG
CTGGACTCGTCTAGGTGCTCAAGCTACA
>Supertig106616_1|Scaffold106616|23148|23305| DNA_SRC: Scaffold106616 START: 23148 STOP: 23305 STRAND: + 
CTTCGTTCCCGACTACTTGCTTGAGAGCAAGGGATTTATCAGCCTTCACCGATGATCTCAGTACCGTTTC
GTTTAGTAAGAAGGTCTGGTTTAGTAACTCATCTAACTACTTCGTCTACAAAACTATCACTGAGAGGCCT
GATGTCAACAATGATAGG
>Supertig106616_1|Scaffold106616|24152|24238| DNA_SRC: Scaffold106616 START: 24152 STOP: 24238 STRAND: + 
AGTAAATATATCATGGAGCCCTCTCCCGTCCTGCTCATGAAAGCTGTCAAAAATGACGTCGAAGTGGAGG
CGATGAACCAGGCATTC
>Supertig106616_1|Scaffold106616|25692|25762| DNA_SRC: Scaffold106616 START: 25692 STOP: 25762 STRAND: + 
GATCCAAAAGAAGGAGACGATCGATCGTTGACGGAATGGCTGGTTGCTCAGAAGACTGAAACATTCAGAG
A
>Supertig106616_1|Scaffold106616|26458|26535| DNA_SRC: Scaffold106616 START: 26458 STOP: 26535 STRAND: + 
ATCACATAGCAGTTATCAATACCCGAGTTACGAGACGATAGCCGCCGTAGGATACCACAGTGCCGACTAT
TATTACCA
>Supertig106616_1|Scaffold106616|27073|27144| DNA_SRC: Scaffold106616 START: 27073 STOP: 27144 STRAND: + 
CCCTATAGAGGATGACCGGTTTGCCATACCTACTGGTAAGATGTTCCTCTATGACATGGGAGGACAGTAT
AG
>Supertig106616_1|Scaffold106616|27693|27813| DNA_SRC: Scaffold106616 START: 27693 STOP: 27813 STRAND: + 
AGAAGGGACGACTACCCTCGCCCGAACCTTCTTCTTTGCCAAGGAATGGTATGAGGAGAATGAGAACCGT
TATGAGTTTGATCGCACTTATGATCCTGCAAGGCCAACTGAATTTCAGCAG
>Supertig106616_1|Scaffold106616|27868|27960| DNA_SRC: Scaffold106616 START: 27868 STOP: 27960 STRAND: + 
GCTCTGACGCTCATGGCGCTTTACATATTTTTACCGTCACTGGGCACTGGTCATAGCCGAGCTGCCAAAG
TAACTAGCGCTCAAGCATATCAG
>Supertig106616_1|Scaffold106616|28493|28607| DNA_SRC: Scaffold106616 START: 28493 STOP: 28607 STRAND: + 
TTTTCAGTGGACGAGGAGGGGGTGGCTAGTTGGGTTGGTCAACTTGACCGCTTTGGACGTCGTCGGTCGT
TTATCGAAGTGATATATCGCGAGGATAAGGGAACCAATGTAGGGG
		




###Gene_Info_Comments GLEAN3_07203 ###
Analysis indicates typical cysteines positioned for post-translational processing, protein folding and disulphide bonding in the mature peptide. An additional cysteine residue in a B domain is similar to the sequence of Ciona INS-L3. The B domain is very long in Sp-IGF1. Two dibasic sites in the sequence corresponding to a short C-peptide make it vertebrate insulin- and relaxin-like. All true IGFs in vertebrates have lost dibasic sites in the C-peptide. They are cleaved at the far end of the long C terminus, in an E domain. Two aromatic residues (YY) at the end of the putative B domain are crucial for biological activity of IGFs.
Usually the D domain is short. Dibasic residues (RASR) at the beginning of the long E domain mark the C-terminus of the D domain in IGFs. This makes the D domain of Sp-IGF1 very long in contrast with other IGF sequences.
Annotated with the help of Robert Olinski (Robert.Olinski@neuro.uu.se) and Mohammed Idris (Idris@szn.it) 

###Gene_Info_Comments GLEAN3_04690 ###
Nek11 is split across two overlapping GLEAN models with 100% identity (exons 3 and 4): GLEAN3_04690 and GLEAN3_14659.
###Gene_Info_Comments GLEAN3_14659 ###
Nek11 is split across two overlapping GLEAN models with 100% identity: GLEAN3_04690 and GLEAN3_14659.
See GLEAN3_04690 for the gene model.
###Gene_Info_Comments GLEAN3_15736 ###
Probably has an extra exon predicted towards beginning.
###Gene_Info_Comments GLEAN3_00811 ###
The first 170 aa correspond to RAB3.  The rest contains many repeated sequences and is probably an artifact.
###Gene_Info_Comments GLEAN3_15145 ###
Missing an exon at the beginning?
###Gene_Info_Comments GLEAN3_00253 ###
contains rab domain
No signal on tiling array and no EST.  May be pseudogene or expressed in adult only.
###Gene_Info_Comments GLEAN3_23334 ###
GLEAN3_16582 and GLEAN3_17442 have near exact match to c-terminal portions of the corrected gene model.  Exon 2 is present in the scaffold at 2 different positions. 

Query: 34   GISMQGMDPSHVALTVLMLQNDLFDQFRCDRNVTLGLNHAT 74
            GISMQGMDPSHVALTVLMLQNDLFDQFRCDRNVTLGLNHAT
Sbjct: 5190 GISMQGMDPSHVALTVLMLQNDLFDQFRCDRNVTLGLNHAT 5068

and

Query: 34   GISMQGMDPSHVALTVLMLQNDLFDQFRCDRNVTLGLNHAT 74
            GISMQGMDPSHVALTVLMLQNDLFDQFRCDRNVTLGLNHAT
Sbjct: 8917 GISMQGMDPSHVALTVLMLQNDLFDQFRCDRNVTLGLNHAT 8795

Exon 6 lies outside this scaffold, on a separate small scaffold, as listed in the gene features.


###Gene_Info_Comments GLEAN3_00503 ###
contains rab domain
###Gene_Info_Comments GLEAN3_01288 ###
missing n-term
###Gene_Info_Comments GLEAN3_13132 ###
GLEAN3_03272 is a partially similar prediction.
###Gene_Info_Comments GLEAN3_03272 ###
GLEAN3_13132 is a partially overlapping, similar prediction.
###Gene_Info_Comments GLEAN3_01818 ###
appears to be partial
###Gene_Info_Comments GLEAN3_22309 ###
Part of sequence of this gene model is also found on another scaffold in GLEAN3_01872 (sequence almost identical for part; 3' end missing).  Is this a duplication or part of the genome that has been included in the assembly twice?
###Gene_Info_Comments GLEAN3_06045 ###
SRCR(2). Probably incomplete.
###Gene_Info_Comments GLEAN3_06055 ###
SigPep-SRCR(4)-TM.
###Gene_Info_Comments GLEAN3_06254 ###
SRCR(4)-TM. Possibly incomplete.
###Gene_Info_Comments GLEAN3_06264 ###
SRCR(4). Probably incomplete.
###Gene_Info_Comments GLEAN3_06531 ###
SRCR(4). Probably incomplete.
###Gene_Info_Comments GLEAN3_02507 ###
rab and DNAj domain
###Gene_Info_Comments GLEAN3_06538 ###
SRCR(2). Probably incomplete. Check in reference to GLEAN3_06539.
###Gene_Info_Comments GLEAN3_06539 ###
SigPep-SRCR(12). Probably incomplete. Check GLEAN3_06538.
###Gene_Info_Comments GLEAN3_21493 ###
AA 193-340 in prediction does not match human homologue; in fact matches nothing (except for another predicted S. purp sequence [XP_791406]) when blasted by itself.  AA 497-682 of prediction matches a P. liv EST in the Max Planck database.
###Gene_Info_Comments GLEAN3_06659 ###
SRCR(6)-LRR(3). Possibly incomplete. Unique domain organization.
###Gene_Info_Comments GLEAN3_06731 ###
SigPep-SRCR(6)-TM.
###Gene_Info_Comments GLEAN3_07110 ###
SRCR(7). Probably incomplete.
###Gene_Info_Comments GLEAN3_21121 ###
Overlap with Glean3_19174. This model appears to encode the C-terminal part of SpCul-3.
###Gene_Info_Comments GLEAN3_19174 ###
Overlap with Glean3_21121.  This model appears to encode the N-terminus of SpCul-3.
###Gene_Info_Comments GLEAN3_07349 ###
WSC(2)-SRCR-WSC-SRCR-WSC-TM. Possibly incomplete. Unique domain composition.
###Gene_Info_Comments GLEAN3_01926 ###
Encodes sequences in middle of protein, corresponding to amino acids 447-659 of human cullin 4A.
###Gene_Info_Comments GLEAN3_28294 ###
This protein contains the following domains:
DEATH,NACHT and LRRs
###Gene_Info_Comments GLEAN3_09731 ###
Appears to encode C-terminal portion of the protein, corresponding to amino acids downstream of residue number 704 of human cullin 4.
###Gene_Info_Comments GLEAN3_18555 ###
Appears to encode the N-terminal portion of the protein, corresponding to amino acids 27-346 of human cullin 4.  Glean3_18556 also appears to have portions of this gene, assembled with parts of various other genes including PEX1.
###Gene_Info_Comments GLEAN3_07370 ###
SigPep-SRCR(2)-HYR. Possibly incomplete.
###Gene_Info_Comments GLEAN3_00832 ###
Previously cloned.  GLEAN prediction contains GAPS that should be rectified.  
###Gene_Info_Comments GLEAN3_07372 ###
SRCR(2)-HYR(2)-IgC2-GPS-SRCR(2). Probably partial. See GLEAN3_07370.
###Gene_Info_Comments GLEAN3_07618 ###
SigPep-SRCR(2)-TM.
###Gene_Info_Comments GLEAN3_07660 ###
SRCR(3)-TM. Possibly partial.
###Gene_Info_Comments GLEAN3_07718 ###
SRCR(6)-TM. Possibly partial.
###Gene_Info_Comments GLEAN3_01439 ###
GLEAN3_17618 was another high scoring blast hit for POLR2C in the glean database.
###Gene_Info_Comments GLEAN3_07781 ###
SRCR(10)-TM. Possibly partial. See GLEAN3_07782.
###Gene_Info_Comments GLEAN3_11776 ###
This gene contains 2 NACHT domains which is very unusual. Also, it is located at the end of a Scaffold and could be incomplete.
Domains: DEATH,NACHT,LRR,NACHT,LRRs
###Gene_Info_Comments GLEAN3_13170 ###
The protein encoded by this model appears to be a duplication of the 5' and 3' halves.  The sequence shows two thioester sites, which is unheard of, and no cleavage sites for alpha and beta chains.  
###Gene_Info_Comments GLEAN3_13169 ###
GLEAN3_13169 appears to be a duplication of the 5' end of GLEAN3_13170.
###Gene_Info_Comments GLEAN3_07782 ###
SRCR(5). Probably partial.  See GLEAN3_07781.
###Gene_Info_Comments GLEAN3_07893 ###
SRCR(6). Probably partial. See GLEAN3_07894, 07895, 07896, 07897, 07899, 07900.
###Gene_Info_Comments GLEAN3_07840 ###
cub(4)-SRCR(2). Possibly partial.
###Gene_Info_Comments GLEAN3_07894 ###
SRCR(4). Probably partial. See GLEAN3_07893, 07895, 07896, 07897, 07899, 07900.
###Gene_Info_Comments GLEAN3_07895 ###
SRCR(2). Probably partial. See GLEAN3_07893, 07894, 07896, 07897, 07899, 07900.
###Gene_Info_Comments GLEAN3_07896 ###
SRCR(7)-TM. Probably partial. See GLEAN3_07893, 07894, 07895, 07897, 07899, 07900.
###Gene_Info_Comments GLEAN3_07897 ###
SRCR(2)-TM. Probably partial. See GLEAN3_07893, 07894, 07895, 07896, 07899, 07900.
###Gene_Info_Comments GLEAN3_07899 ###
SigPep-SRCR(5)-TM. Probably partial. See GLEAN3_07893, 07894, 07895, 07896, 07897, 07900.
###Gene_Info_Comments GLEAN3_07900 ###
SRCR(4)-TM. Probably partial. See GLEAN3_07893, 07894, 07895, 07896, 07897, 07899.
###Gene_Info_Comments GLEAN3_12678 ###
This GLEAN is missing the N-terminal amino acid sequence of an alpha-tubulin, and is adjacent to another alpha-tubulin Gene Model, GLEAN3_12679.
###Gene_Info_Comments GLEAN3_10277 ###
Only exons 3-22 are present on this GLEAN model on scaffold464. There is a gap of ~100bp where exon 15 (assuming that this part represents only one exon) could be located. Exons 1-2 are present on scaffold35149, which has no GLEAN prediction.
###Gene_Info_Comments GLEAN3_11107 ###
matches to the 3' end of GLEAN3_11106
###Gene_Info_Comments GLEAN3_06756 ###
This GLEAN contains an insertion at its amino terminus that is inconsistent with assignment as a conventional alpha-tubulin.
###Gene_Info_Comments GLEAN3_07984 ###
This GLEAN contains an insertion at its amino terminus that is inconsistent with assignment as a conventional alpha-tubulin.  There is a "T" missing, probably from sequencing error.
###Gene_Info_Comments GLEAN3_27579 ###
This GLEAN has a good full-length match with Chlamydomonas reinhardtii RIB43A (E = 2.00E-42).
###Gene_Info_Comments GLEAN3_25241 ###
One of 2. GLEAN3_22153 is a non-identical duplicate
###Gene_Info_Comments GLEAN3_08432 ###
SRCR(4). Probably incomplete. 
###Gene_Info_Comments GLEAN3_28690 ###
This gene model has one thrombospondin domain and does not show high scoring matches to other sequences on GenBank.
###Gene_Info_Comments GLEAN3_08504 ###
SRCR(16). Probably incomplete.
###Gene_Info_Comments GLEAN3_08514 ###
SRCR(2)-TM. Possibly incomplete.
###Gene_Info_Comments GLEAN3_08598 ###
SRCR(2). Probably incomplete. 
###Gene_Info_Comments GLEAN3_08642 ###
SigPep-SRCR(8). Possibly incomplete.
###Gene_Info_Comments GLEAN3_11197 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and domain structure analyses.

This annotation is supported by reciprocal blasting to Inhibitor of NFkappaB genes from various taxa and an identical domain structure. Its structure also strongly correlates with genome-wide tiling array hybridization data.
###Gene_Info_Comments GLEAN3_08836 ###
SigPep-SRCR(2). Possibly incomplete.
###Gene_Info_Comments GLEAN3_08885 ###
SRCR(3). Probably incomplete.
###Gene_Info_Comments GLEAN3_09145 ###
SRCR(2). Probably incomplete.
###Gene_Info_Comments GLEAN3_05782 ###
contains rab, ras and trunk domains
###Gene_Info_Comments GLEAN3_06362 ###
PARTIAL--MISSING N-TERMINUS
No signal in tiling array or EST.  May be pseudogene or adult only expression.
###Gene_Info_Comments GLEAN3_24635 ###
 partial, missing C-terminus
###Gene_Info_Comments GLEAN3_27868 ###
 missing central stretch
###Gene_Info_Comments GLEAN3_28499 ###
 extra N- and C-terminus
###Gene_Info_Comments GLEAN3_04006 ###
One of 2. GLEAN3_21174 is a shorter internal version of this GLEAN. 21174 is mostly identical (protein level), but does not match on either predicted end.
###Gene_Info_Comments GLEAN3_08777 ###
This GLEAN shares ~50% sequence identity over nearly its entire length with NP_653306 (Homo sapiens tektin-1).  Other than a discrepancy of a 39-amino acid insertion at the amino terminus of GLEAN3_19591, the coding regions of GLEAN3_08777 and GLEAN3_19591 are identical.
###Gene_Info_Comments GLEAN3_19591 ###
This GLEAN shares ~50% sequence identity over nearly its entire length with NP_653306 (Homo sapiens tektin-1).  But the first 39 predicted amino acids after the initiator M (DAGATLLSRSYAPTIPVYPTQTTVGTKTDQALSQDLAKM) look like they don't belong.  Other than this discrepancy, the coding regions of GLEAN3_08777 and GLEAN3_19591 are identical.
###Gene_Info_Comments GLEAN3_23618 ###
This GLEAN shares >50% sequence identity over nearly its entire length with NP_114104 (Homo sapiens tektin-3).
###Gene_Info_Comments GLEAN3_06388 ###
IDENTICAL TO GLEAN3_06389 except at 3'end
###Gene_Info_Comments GLEAN3_06453 ###
This GLEAN shares ~50% sequence identity over nearly its entire length with NP_444515.1 (Homo sapiens tektin-1).  The coding regions of GLEAN3_13841 and GLEAN3_06453 are identical.
###Gene_Info_Comments GLEAN3_06392 ###
contains ADP-ribosyl-GH domain and rab domain
###Gene_Info_Comments GLEAN3_20728 ###
This GLEAN shares ~52% sequence identity over ~400 amino acids with NP_055281 (Homo sapiens tektin-2).
###Gene_Info_Comments GLEAN3_00049 ###
 missing N-terminus
###Gene_Info_Comments GLEAN3_07237 ###
 fragment
###Gene_Info_Comments GLEAN3_21063 ###
 half-molecule
###Gene_Info_Comments GLEAN3_05461 ###
 fragment
###Gene_Info_Comments GLEAN3_07285 ###
contains endo/exonuclease domains and phosphatase domains and ras/rho domain
###Gene_Info_Comments GLEAN3_04038 ###
This is a short fragment that appears to encode a protein identical to the Sp-betaL Integrin. It is an incomplete sequence at the end of a short scaffold.  
###Gene_Info_Comments GLEAN3_00348 ###
 fragment, should join with GLEAN3_00349, still incomplete gene
###Gene_Info_Comments GLEAN3_00349 ###
 fragment, should join with GLEAN3_00348, still incomplete gene
###Gene_Info_Comments GLEAN3_21469 ###
 extra N-terminus half
###Gene_Info_Comments GLEAN3_24384 ###
 extra C-terminus half
###Gene_Info_Comments GLEAN3_01041 ###
 contains 2 repeats matching a similar stretch
###Gene_Info_Comments GLEAN3_03580 ###
 extra N- and C-terminus stretches
###Gene_Info_Comments GLEAN3_04561 ###
 extra N-terminus region
###Gene_Info_Comments GLEAN3_04242 ###
 unrelated stretch on N-terminus, partial match to the gene
###Gene_Info_Comments GLEAN3_07951 ###
 fragment
###Gene_Info_Comments GLEAN3_14894 ###
 partial, missing C-terminus half, some extra stretches in middle
###Gene_Info_Comments GLEAN3_16354 ###
 unrelated N-terminus half, only the C-terminus half matches the gene
###Gene_Info_Comments GLEAN3_14094 ###
 fragment
###Gene_Info_Comments GLEAN3_11806 ###
 fragment
###Gene_Info_Comments GLEAN3_11846 ###
 fragment
###Gene_Info_Comments GLEAN3_11894 ###
 fragment
###Gene_Info_Comments GLEAN3_11897 ###
 fragment
###Gene_Info_Comments GLEAN3_12522 ###
 fragment
###Gene_Info_Comments GLEAN3_13365 ###
 fragment
###Gene_Info_Comments GLEAN3_13505 ###
 fragment
###Gene_Info_Comments GLEAN3_13566 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_14209 ###
 fragment
###Gene_Info_Comments GLEAN3_14873 ###
 fragment
###Gene_Info_Comments GLEAN3_14903 ###
 fragment
###Gene_Info_Comments GLEAN3_15011 ###
 fragment
###Gene_Info_Comments GLEAN3_15530 ###
 fragment
###Gene_Info_Comments GLEAN3_16239 ###
 fragment
###Gene_Info_Comments GLEAN3_16280 ###
 fragment
###Gene_Info_Comments GLEAN3_16701 ###
 fragment
###Gene_Info_Comments GLEAN3_17893 ###
 extra N-terminus
###Gene_Info_Comments GLEAN3_18043 ###
 fragment
###Gene_Info_Comments GLEAN3_18288 ###
 fragment
###Gene_Info_Comments GLEAN3_19705 ###
 fragment
###Gene_Info_Comments GLEAN3_19912 ###
 fragment
###Gene_Info_Comments GLEAN3_08410 ###
IDENTICAL TO 15580
###Gene_Info_Comments GLEAN3_14036 ###
This gene spans three scaffolds (544, 81593 and 56300). On scaffold 544, one GLEAN model (GLEAN3_25514) covers exons 1-3 of this gene. On scaffold 81593, GLEAN3_14036 is predicted (exons 2-15). On scaffold 56300, the prediction for this gene is GLEAN3_17882 (exons 8-24). Exons 2 and 3 overlap between scaffolds 544 and 81593, and exons 8-15 overlap between scaffolds 81593 and 56300. Please refer to GLEAN3_25514 and GLEAN3_17882 for the gene features of the far N-terminal and C-terminal portions, respectively.
###Gene_Info_Comments GLEAN3_17882 ###
This gene spans three scaffolds (544, 81593 and 56300). On scaffold 544, a GLEAN model (GLEAN3_25514) covers exons 1-3 of this gene. On scaffold 81593, GLEAN3_14036 is predicted (exons 2-15). On scaffold 56300, the prediction for this gene is GLEAN3_17882 (exons 8-24). Exons 2 and 3 overlap between scaffolds 544 and 81593, and exons 8-15 overlap between scaffolds 81593 and 56300. Please refer to GLEAN3_14036 and GLEAN3_25514 for the gene features of the N-terminal portion.
###Gene_Info_Comments GLEAN3_09123 ###
poor conservation 
###Gene_Info_Comments GLEAN3_25502 ###
Ig7/FN5/TM

best hit is Ds-CAM but the domain organization is not quite right for either Ds-CAM (9-4-1-2) or DCC (4-6)
Does not have Neogenin_C cytoplasmic match

probably neither but some related gene
###Gene_Info_Comments GLEAN3_04328 ###
Igc2-4/FN3-5/TM

High Blast hit (not #1) with DCC
C-terminus (putative cyto domain) does not have Neogenin_C
and is not homologous with DCC in Blast

Domain organisation not consistent with Ds-CAM or DCC (4-6)
###Gene_Info_Comments GLEAN3_09220 ###
SigPep-SRCR-WSC-CUB. Possibly incomplete.
###Gene_Info_Comments GLEAN3_09354 ###
SRCR(4). Possibly partial.
###Gene_Info_Comments GLEAN3_09496 ###
SigPep-SRCR(6). Possibly incomplete.
###Gene_Info_Comments GLEAN3_09562 ###
SRCR(2). Probably partial.
###Gene_Info_Comments GLEAN3_09676 ###
SRCR(3). Probably incomplete. See GLEAN3_09677.
###Gene_Info_Comments GLEAN3_05039 ###
has GPS and a single TM - probably missing C-terminus.
five LDLa and one EGF in exodomain - there are no reported LDLa-LNB7TM GPCRs but St purp has several possible members

Novel domain structure 
###Gene_Info_Comments GLEAN3_09677 ###
SRCR(3). Probably incomplete. See GLEAN3_09676.
###Gene_Info_Comments GLEAN3_09753 ###
SRCR(3). Probably partial.
###Gene_Info_Comments GLEAN3_05758 ###
has GPS and four TM - probably missing C-terminus.
three LDLa in exodomain - there are no reported LDLa-LNB7TM GPCRs but St purp has several

Novel domain structure 
###Gene_Info_Comments GLEAN3_09869 ###
contains Rab, Arf-GAP, and Ank domains
###Gene_Info_Comments GLEAN3_19158 ###
NO GPS but  7TM-1 
five LDLa in exodomain - there are no reported LDLa-LNB7TM GPCRs but St purp has several of this type

No GPS - looks most like glycoprotein hormone receptors

###Gene_Info_Comments GLEAN3_09988 ###
Ig(2)-SRCR(6). Unique structure. Probably incomplete. See GLEAN3_09989
###Gene_Info_Comments GLEAN3_26060 ###
has GPS but no TM - probably missing C-terminus.
two LDLa and two EGF in exodomain - there are no reported LDLa-LNB7TM GPCRs but St purp has several

Novel domain structure 
###Gene_Info_Comments GLEAN3_09910 ###
PARTIAL, MISSING N-TERMINUS
###Gene_Info_Comments GLEAN3_09989 ###
SRCR(6). Probably incomplete. See GLEAN3_09988.
###Gene_Info_Comments GLEAN3_10001 ###
SRCR(2). Probably incomplete.
###Gene_Info_Comments GLEAN3_10062 ###
SRCR(6). Possibly incomplete.
###Gene_Info_Comments GLEAN3_10226 ###
SRCR(4). Possibly incomplete. See GLEAN3_10227.
###Gene_Info_Comments GLEAN3_00002 ###
CUB-LDLa x2 7TM

No GPS but otherwise looks like a member of the LNB-7TM family of adhesion domain GPCRs
No known LDLa members of LNB7TM GPCR family
Novel architecture
###Gene_Info_Comments GLEAN3_10227 ###
F5_F8_type_C-SRCR(9). Probably incomplete. See GLEAN3_10226.
###Gene_Info_Comments GLEAN3_23577 ###
FA58C-CUB-CLECT-LDLa x4-LRR x3 - 7TM

No GPS but otherwise looks like a member of the LNB-7TM family of adhesion domain GPCRs
No known FA58C or LDLa members of LNB7TM GPCR family
Novel architecture
###Gene_Info_Comments GLEAN3_10232 ###
SigPep-SRCR(4). Possibly incomplete. 
###Gene_Info_Comments GLEAN3_27136 ###
LDLa-LRRNT-LRRtypx3 - 7TM_1

No GPS - looks most like glycoprotein hormone receptors

###Gene_Info_Comments GLEAN3_10240 ###
SRCR(2). Possibly incomplete. See GLEAN3_10241.
###Gene_Info_Comments GLEAN3_10241 ###
SRCR(2). Possibly incomplete. See GLEAN3_10240
###Gene_Info_Comments GLEAN3_14777 ###
LDLa x5-EGFx2-Igc2-GPS - 7tm_2
Looks like a member of the LNB-7TM family of adhesion domain GPCRs
No known LDLa members of LNB7TM GPCR family
Novel architecture
###Gene_Info_Comments GLEAN3_11062 ###
CUB - LDLa - 7tm_1
No GPS and 7tm_1 - could be a member of the LNB-7TM subfamily - more likely a member of the glycoprotein hormone receptor family.
No known LDLa members of LNB7TM GPCR family

###Gene_Info_Comments GLEAN3_10330 ###
SigPep-SRCR(2)-HYR
###Gene_Info_Comments GLEAN3_20284 ###
identical to parts of GLEAN3_27935
###Gene_Info_Comments GLEAN3_10409 ###
SRCR(5)
###Gene_Info_Comments GLEAN3_10501 ###
SigPep-SRCR(4). Possibly incomplete.
###Gene_Info_Comments GLEAN3_22714 ###
LDLa x9-EGFCa x4-Ig 7TM_2

No GPS but otherwise looks like a member of the LNB-7TM family of adhesion domain GPCRs

No known LDLa members of LNB7TM GPCR family
Novel architecture
###Gene_Info_Comments GLEAN3_10523 ###
SRCR(2). Possibly incomplete.
###Gene_Info_Comments GLEAN3_05239 ###
LDLa x10-LRRtyp x5 7TM


No GPS - looks most like glycoprotein hormone receptors


###Gene_Info_Comments GLEAN3_10832 ###
SRCR(3)-TM. Possibly incomplete.
###Gene_Info_Comments GLEAN3_10909 ###
SRCR(3)-TM. Possibly incomplete.
###Gene_Info_Comments GLEAN3_04837 ###
LDLa x2-LRRtyp x5 7TM

No GPS - looks most like glycoprotein hormone receptors

No known LDLa members of LNB7TM GPCR family

###Gene_Info_Comments GLEAN3_10953 ###
SRCR(5). Possibly incomplete.
###Gene_Info_Comments GLEAN3_09242 ###
CUBx5-FA58C-CUB-LDLa x3 5TM

No GPS but otherwise looks like a member of the LNB-7TM family of adhesion domain GPCRs

No known FA58C or LDLa members of LNB7TM GPCR family
Novel architecture
###Gene_Info_Comments GLEAN3_10991 ###
SRCR(6)-TM. Possibly incomplete. See GLEAN3_10992, 10993, 10994.
###Gene_Info_Comments GLEAN3_05132 ###
CUB-CLECT-LDLa-EGF-LDLa x3 7TM

No GPS but otherwise looks like a member of the LNB-7TM family of adhesion domain GPCRs

No known LDLa members of LNB7TM GPCR family
Novel architecture
###Gene_Info_Comments GLEAN3_10992 ###
SigPep-SRCR(2)-TM. Possibly incomplete. See GLEAN3_10991, 10993, 10994.
###Gene_Info_Comments GLEAN3_10993 ###
SigPep-SRCR(5). Possibly incomplete. See GLEAN3_10991, 10992, 10994.
###Gene_Info_Comments GLEAN3_12382 ###
CLECT-LDLa x6-LRRNT-LRRtyp x4 7TM-1

No GPS and 7tm-1 rather than 7tm_2 but otherwise looks like a member of the LNB-7TM family of adhesion domain GPCRs

No known LDLa members of LNB7TM GPCR family
Novel architecture
###Gene_Info_Comments GLEAN3_10994 ###
SRCR(2)-TM. Possibly incomplete. See GLEAN3_10991, 10992, 10993.
###Gene_Info_Comments GLEAN3_07191 ###
fragment; see Glean3_07191 for larger Sp-Trh
###Gene_Info_Comments GLEAN3_15872 ###
CUB-LDLa x2-LRRtyp x2 7TM_1

No GPS but otherwise looks a bit like a member of the LNB-7TM family of adhesion domain GPCRs or, perhaps more likely, a glycoprotein hormone receptor

No known LDLa members of LNB7TM GPCR family
Novel architecture
###Gene_Info_Comments GLEAN3_11101 ###
SigPep-SRCR(2). Possibly incomplete.
###Gene_Info_Comments GLEAN3_11146 ###
SRCR(3). Possibly incomplete.
###Gene_Info_Comments GLEAN3_15161 ###
LDLa-LRRtyp x5 7TM

No GPS - looks most like glycoprotein hormone receptors

###Gene_Info_Comments GLEAN3_28049 ###
CUB-CLECT-LDLa x5/6-LRRtyp x 7TM_1


No GPS - looks most like glycoprotein hormone receptors
###Gene_Info_Comments GLEAN3_08926 ###
LDLa-Ig-GPS-7TM
Looks like a member of the LNB-7TM family of adhesion domain GPCRs

No known LDLa members of LNB7TM GPCR family
Novel architecture
###Gene_Info_Comments GLEAN3_11222 ###
SigPep-SRCR(13)-TM.
###Gene_Info_Comments GLEAN3_11752 ###
SRCR(2). Probably incomplete.
###Gene_Info_Comments GLEAN3_13248 ###
LRRtyp x3 - 7TM

No GPS - looks most like glycoprotein hormone receptors
###Gene_Info_Comments GLEAN3_11977 ###
SigPep-SRCR(4). Possibly incomplete
###Gene_Info_Comments GLEAN3_12039 ###
SigPep-SRCR(7). Possibly incomplete.
###Gene_Info_Comments GLEAN3_27096 ###
SR x4 7TM

No GPS but otherwise looks a bit like a member of the LNB-7TM family of adhesion domain GPCRs
###Gene_Info_Comments GLEAN3_12159 ###
SRCR(3)-TM. Probably incomplete.
###Gene_Info_Comments GLEAN3_19239 ###
SR x4 - LRRtyp x7 - 7TM_1


No GPS - looks most like glycoprotein hormone receptors
###Gene_Info_Comments GLEAN3_12230 ###
SigPep-SRCR(4)-EGF
###Gene_Info_Comments GLEAN3_22189 ###
CUB-LRRtyp x4 - 7TM_1

No GPS - looks most like glycoprotein hormone receptors
###Gene_Info_Comments GLEAN3_12410 ###
SigPep-SRCR(6). Possibly incomplete.
###Gene_Info_Comments GLEAN3_03206 ###
LRRtyp x3 - 7TM

No GPS - looks most like glycoprotein hormone receptors

###Gene_Info_Comments GLEAN3_12888 ###
SRCR(3)-TM. Possibly incomplete.
###Gene_Info_Comments GLEAN3_07937 ###
LDLa x2 - 7TM
No GPS but otherwise looks like a member of the LNB-7TM family of adhesion domain GPCRs

###Gene_Info_Comments GLEAN3_13650 ###
SigPep-SRCR(3). Possibly incomplete.
###Gene_Info_Comments GLEAN3_13831 ###
SRCR(3). Probably incomplete.
###Gene_Info_Comments GLEAN3_16033 ###
LRRtyp x2 - 7TM-1

No GPS - looks most like glycoprotein hormone receptors
###Gene_Info_Comments GLEAN3_19238 ###
SR x3 - LRRtyp x7 - 7TM_1


No GPS - looks most like glycoprotein hormone receptors
###Gene_Info_Comments GLEAN3_13958 ###
SRCR(2)-TM. Possibly incomplete.
###Gene_Info_Comments GLEAN3_14079 ###
SRCR(4)-TM. Probably incomplete. See GLEAN3_14080.
###Gene_Info_Comments GLEAN3_14080 ###
SigPep-SRCR(2). Probably incomplete. See GLEAN3_14079.
###Gene_Info_Comments GLEAN3_14095 ###
SigPep-SRCR(4). Possibly incomplete.
###Gene_Info_Comments GLEAN3_14769 ###
LRRtyp x10 - 7TM_1

No GPS - looks most like glycoprotein hormone receptors

NOTE GLEAN3_14765 contains a very similar gene fused with a Cathepsin gene 
###Gene_Info_Comments GLEAN3_14602 ###
SigPep-SRCR(6). Possibly incomplete.
###Gene_Info_Comments GLEAN3_14829 ###
SRCR(7). Probably incomplete.
###Gene_Info_Comments GLEAN3_12610 ###
LRRtyp x3 7TM

No GPS - looks most like glycoprotein hormone receptors
###Gene_Info_Comments GLEAN3_18295 ###
LRRtyp x2 7TM_1


No GPS - looks most like glycoprotein hormone receptors
###Gene_Info_Comments GLEAN3_08259 ###
LRRtyp x3 7TM  ShKT domain at N-terminus may be an artefact

No GPS but otherwise looks like a member of the LNB-7TM family of adhesion domain GPCRs or like glycoprotein hormone receptors
###Gene_Info_Comments GLEAN3_18294 ###
LRRtyp x2 - 7TM_1


No GPS - looks most like glycoprotein hormone receptors
###Gene_Info_Comments Sp-HIF-1a ###
This is the last 18 exons of Glean3_01262. The rest of Glean3_01262 has been annotated as a separate gene, Sp-Birc6
###Gene_Info_Comments GLEAN3_18887 ###
This is part of a sea urchin specific group of ADAM-TS metalloproteinase genes.  There are two worm ADAMTS genes with some sequence similarity (wormbase# F08C6.1a.1 and C02B4.1).  

The homeobox at the N-terminus is clearly a prediction/assembly error.
###Gene_Info_Comments GLEAN3_10561 ###
LRRtyp x3 - 7TM-1

No GPS - looks most like glycoprotein/thyrotropin hormone receptors
###Gene_Info_Comments GLEAN3_21131 ###
This is part of a sea urchin specific group of ADAM-TS genes.  There are two worm ADAMTS genes with some sequence similarity (wormbase# F08C6.1a.1 and C02B4.1).  This sequence also appears to be a haplotype but is missing a portion.
###Gene_Info_Comments GLEAN3_24052 ###
LRRtyp x2 - 7TM-1

No GPS - looks most like glycoprotein/thyrotropin hormone receptors
###Gene_Info_Comments GLEAN3_24290 ###
LRRtyp x2 - 7TM-1

No GPS - looks most like glycoprotein/thyrotropin hormone receptors
###Gene_Info_Comments GLEAN3_15464 ###
LRRtyp x6 - 7TM-1

No GPS - looks most like glycoprotein/thyrotropin hormone receptors
###Gene_Info_Comments GLEAN3_00745 ###
contains only part of the reprolysin domain
###Gene_Info_Comments GLEAN3_12494 ###
LRRNT/LRRtyp/LRRCT-GPS- 7TM-2

looks like a member of the LNB-7TM subfamily of GPCRs
###Gene_Info_Comments GLEAN3_03202 ###
only domain it contains is a part of the reprolysin domain
###Gene_Info_Comments GLEAN3_07355 ###
only domain it contains is the reprolysin domain
###Gene_Info_Comments GLEAN3_04726 ###
SR x4 - LRRtyp x3 - 7TM_1


No GPS - looks most like glycoprotein hormone receptors
###Gene_Info_Comments GLEAN3_06020 ###
May be part of the sea urchin only family of genes, but it contains only the metalloprotease and reprolysin domains, which are the first parts of an ADAM-TS gene.   There are two worm ADAMTS genes with some sequence similarity (wormbase# F08C6.1a.1 and C02B4.1)
###Gene_Info_Comments GLEAN3_18297 ###
SR x3 - LRRtyp x4 - 7TM_1


No GPS - looks most like glycoprotein hormone receptors
###Gene_Info_Comments GLEAN3_23004 ###
contains only the reprolysin domain
###Gene_Info_Comments GLEAN3_15133 ###
LRRtyp x3 - 7TM-1

No GPS - looks most like glycoprotein/thyrotropin hormone receptors
###Gene_Info_Comments GLEAN3_09503 ###
The sequence is similar to both ADAM-TS6 and ADAM-TS10, but it looks like it is a partial sequence containing only a TSP1 and ADAMs spacer.
###Gene_Info_Comments GLEAN3_11913 ###
roots the clade that contains vertebrate ADAM-TS2 and ADAM-TS3
###Gene_Info_Comments GLEAN3_20686 ###
LRRtyp x3 - 7TM-1

No GPS - looks most like glycoprotein/thyrotropin hormone receptors
###Gene_Info_Comments GLEAN3_02876 ###
LRRtyp x5 - 7TM-1

No GPS - looks most like glycoprotein/thyrotropin hormone receptors
###Gene_Info_Comments GLEAN3_06540 ###
LRRtyp x2 - 7TM-1

No GPS - looks most like glycoprotein/thyrotropin hormone receptors
###Gene_Info_Comments GLEAN3_23903 ###
This gene is more common in arthropods than in vertebrates. 

It has some domains that are characteristic of ADAM-TS proteins, including TSP-1 and the N-term ADAM spacer domain.  However, it has TY and KU domains, which are novel for an ADAM-TS gene found in vertebrates but normal for Papilin found in arthropods.
###Gene_Info_Comments GLEAN3_02031 ###
This gene is more common in arthropods than in vertebrates. 

It has some domains that are characteristic of ADAM-TS proteins, including TSP-1 and the N-term ADAM spacer domain.  However, it has TY and KU domains, which are novel for an ADAM-TS gene found in vertebrates but normal for Papilin found in arthropods.

###Gene_Info_Comments GLEAN3_20547 ###
Similar to ADAM15, and to the ADAM15-like alleles, but distinct enough to likely be a separate gene.

Disintegrin/ACR/TM - domain structure characteristic of an ADAM.
Note that the gene is near to another gene with a very similar 
structure (GLEAN3_20545)

###Gene_Info_Comments GLEAN3_22131 ###
PREDICTED: Strongylocentrotus purpuratus similar to 106 kDa O-GlcNAc transferase-interacting protein (LOC589805),
###Gene_Info_Comments GLEAN3_23613 ###
PREDICTED: Strongylocentrotus purpuratus similar to Ubiquitin-conjugating enzyme E2-17 kDa (Ubiquitin-protein ligase)(Ubiquitin carrier protein) (Effete protein) (LOC586593) 
###Gene_Info_Comments GLEAN3_12475 ###
PREDICTED: Strongylocentrotus purpuratus similar to Separin
(Separase) (Caspase-like protein ESPL1) (Extra spindle poles-like 1 protein) (LOC576865), mRNA.

###Gene_Info_Comments GLEAN3_13841 ###
This GLEAN shares ~50% sequence identity over nearly its entire length with NP_444515.1 (Homo sapiens tektin-1).  The coding regions of GLEAN3_13841 and GLEAN3_06453 are identical.
###Gene_Info_Comments GLEAN3_00216 ###
5 EGF repeats, 13 HYR
###Gene_Info_Comments GLEAN3_03152 ###
This F-box protein is most similar to Fbw7 (Drosophila archipelago), the F-box protein that targets cyclin E for destruction.  However, it is significantly less similar to human Fbw7 than is Glean3_19951, which is likely the Fbw7 ortholog, and I have thus named it "Fbw7-like".
###Gene_Info_Comments GLEAN3_00883 ###
Likely to be incomplete.   Has 1 EGF, 1HYR, and 1 complete C lectin domain
###Gene_Info_Comments GLEAN3_06781 ###
See also Glean3_07933.
###Gene_Info_Comments GLEAN3_01854 ###
multiple EGFCa - GPS -7tm_2

good match to overall pattern of LNB-7TM-GPCRs
###Gene_Info_Comments GLEAN3_07778 ###
See also Glean3_13579, which encodes a nearly identical protein.
###Gene_Info_Comments GLEAN3_07916 ###
CUB plus multiple EGFCa - GPS NO TMs predicted

Apart from lack of -7tm_2 this is a good match to overall pattern of LNB-7TM-GPCRs
Probably missing C-terminus
###Gene_Info_Comments GLEAN3_21402 ###
CUB x2 - multiple EGFCa - HormR-GPS x2 - NO TMs predicted

Apart from lack of 7tm_2 a good match to overall pattern of LNB-7TM-GPCRs

C-terminus may be off
###Gene_Info_Comments GLEAN3_07933 ###
Note that the protein sequence of this model is identical to that of Glean3_06781.
###Gene_Info_Comments GLEAN3_07915 ###
multiple EGFCa - GPS -7tm_2

good match to overall pattern of LNB-7TM-GPCRs

Note adjacent gene is similar
###Gene_Info_Comments GLEAN3_18381 ###
PREDICTED: Strongylocentrotus purpuratus similar to Huntingtin (Huntingtons disease protein) (HD protein) (LOC590871), partial mRNA.

###Gene_Info_Comments GLEAN3_22093 ###
LDLa x2 - EGF x2- Ig - GPS - NO TMs predicted

Apart from lack of 7tm_2 domain this matches to overall pattern of LNB-7TM-GPCRs
May be missing C-terminus
###Gene_Info_Comments GLEAN3_26092 ###
multiple EGF_Ca - Ig - GPS - 7tm_2
Good match to overall pattern of LNB-7TM-GPCRs

###Gene_Info_Comments GLEAN3_19231 ###
CUB - multiple EGF_Ca - HormR - GPS - 7tm_2
Good match to overall pattern of LNB-7TM-GPCRs

###Gene_Info_Comments GLEAN3_08005 ###
2 EGF/EGFCas -GPS-7TM_2
matches to overall pattern of LNB-7TM-GPCRs

###Gene_Info_Comments GLEAN3_03304 ###
EGFCa -GPS-7TM_2
matches to overall pattern of LNB-7TM-GPCRs

###Gene_Info_Comments GLEAN3_21254 ###
2 EGFCa-GPS-7TM_2
matches to overall pattern of LNB-7TM-GPCRs

###Gene_Info_Comments GLEAN3_07742 ###
multiple EGFCa - Ig -GPS-  three TM segments
Apart from absence of complete 7tm_2 domain matches to overall pattern of LNB-7TM-GPCRs
May be missing C-terminus
###Gene_Info_Comments GLEAN3_13572 ###
multiple EGFCa - Ig -GPS-  single TM segment
Apart from absence of complete 7tm_2 domain matches to overall pattern of LNB-7TM-GPCRs
May be missing C-terminus
###Gene_Info_Comments GLEAN3_13579 ###
The amino acid sequence of this gene model is nearly identical to that of Glean3_07778, although they differ at their C-termini.
###Gene_Info_Comments GLEAN3_23715 ###
The sequence coded by this GLEAN is entirely contained in GLEAN3_05762; refer to GLEAN3_05762 for further annotation
###Gene_Info_Comments GLEAN3_12362 ###
EGF - Ig -GPS- 7tm_2
matches to overall pattern of LNB-7TM-GPCRs

###Gene_Info_Comments GLEAN3_22320 ###
LDLa - EGF - Ig -GPS- single TM
Apart from lack of full 7tm_1 or 7tm_2 domain
matches to overall pattern of LNB-7TM-GPCRs or to hormone receptors
###Gene_Info_Comments GLEAN3_14667 ###
CUB - SO - NO GPS- 7tm_2
Because of complete 7tm_2 domain matches to overall pattern of LNB-7TM-GPCRs but no GPS - could also be a member of the glycoprotein hormone receptor subfamily

###Gene_Info_Comments GLEAN3_24566 ###
CUB - NO GPS - 7tm_2
Because of complete 7tm_2 domain matches to overall pattern of LNB-7TM-GPCRs but no GPS - could also be a member of the glycoprotein hormone receptor subfamily

###Gene_Info_Comments GLEAN3_22019 ###
CUB - SO - one pfam:collagen repeat - NO GPS- 7tm_2
Because of complete 7tm_2 domain matches to overall pattern of LNB-7TM-GPCRs but no GPS - could also be a member of the glycoprotein hormone receptor subfamily

###Gene_Info_Comments GLEAN3_09826 ###
NOT Embryonically expressed.  May be pseudo-gene or adult-only gene.
###Gene_Info_Comments GLEAN3_23643 ###
NOT Embryonically expressed.  May be pseudo-gene or adult-only gene.
###Gene_Info_Comments GLEAN3_13269 ###
NOT Embryonically expressed.  May be pseudo-gene or adult-only gene.
###Gene_Info_Comments GLEAN3_28112 ###
NOT Embryonically expressed.  May be pseudo-gene or adult-only gene.
###Gene_Info_Comments GLEAN3_04496 ###
IDENTICAL sequence to GLEAN3_04096  
NOT Embryonically expressed.  May be pseudo-gene or adult-only gene.
###Gene_Info_Comments GLEAN3_14844 ###
SigPep-SRCR(3). Possibly partial.
###Gene_Info_Comments GLEAN3_04085 ###
Gene fragment.
###Gene_Info_Comments GLEAN3_14859 ###
SRCR(4). Possibly partial. See GLEAN3_14860.
###Gene_Info_Comments GLEAN3_15163 ###
CCP- 4XHYR - GPS-7TM_2

matches general pattern for LNB-7TM-GPCR receptors
###Gene_Info_Comments GLEAN3_25755 ###
Scaffoldi4711 has both ends of what appears to be a single beta integrin subunit - BetaD (GLEAN3_12985). Amino acids 1-420 appear to be a novel integrin beta subunit.  The scaffold has joined two unrelated genes and I have corrected the model to remove the 3 exons that are not part of this subunit.  The model is incomplete.
###Gene_Info_Comments GLEAN3_14860 ###
SRCR(2). Possibly partial. See GLEAN3_14859.
###Gene_Info_Comments GLEAN3_16723 ###
3 CCP - GPS-7TM_2

matches general pattern for LNB-7TM-GPCR receptors
###Gene_Info_Comments GLEAN3_14992 ###
SRCR(8)-TM. Possibly incomplete. See GLEAN3_14993, 14994.
###Gene_Info_Comments GLEAN3_20525 ###
FA58C- HYR - GPS-7TM_2

matches general pattern for LNB-7TM-GPCR receptors

SIMILAR DOMAIN COMPOSITION TO GLEAN3_13084
NOVEL ARCHITECTURE
###Gene_Info_Comments GLEAN3_14993 ###
RVT_1(probable prediction error?)-SRCR(4)-TM. Possibly incomplete. See GLEAN3_14992, 14994. 
###Gene_Info_Comments GLEAN3_05260 ###
Gene fragment
###Gene_Info_Comments GLEAN3_17226 ###
2x HYR -EGF - GPS-7TM_2

matches general pattern for LNB-7TM-GPCR receptors
###Gene_Info_Comments GLEAN3_04759 ###
NOT Embryonically expressed.  May be pseudo-gene or adult-only gene.
###Gene_Info_Comments GLEAN3_14994 ###
SRCR(9). Possibly incomplete. See GLEAN3_14992, 14993.
###Gene_Info_Comments GLEAN3_15123 ###
SRCR(3). Possibly incomplete. 
###Gene_Info_Comments GLEAN3_15325 ###
SRCR(2). Probably incomplete.
###Gene_Info_Comments GLEAN3_20144 ###
NOTE only last 170-ish aa of this GLEAN match to Rheb
###Gene_Info_Comments GLEAN3_15387 ###
SigPep-SRCR(4). Possibly incomplete.
###Gene_Info_Comments GLEAN3_15539 ###
SigPep-SRCR(2)-WSC. 
###Gene_Info_Comments GLEAN3_15548 ###
SRCR(4)-TM. Possibly incomplete.
###Gene_Info_Comments GLEAN3_15937 ###
SRCR(2)-TM. Possibly incomplete. See GLEAN3_15938.
###Gene_Info_Comments GLEAN3_15938 ###
SigPep-SRCR(3). Possibly incomplete. See GLEAN3_15937.
###Gene_Info_Comments GLEAN3_15989 ###
SRCR(2). Possibly incomplete.
###Gene_Info_Comments GLEAN3_07557 ###
Missing 5' segment, likely to be on a different scaffold
###Gene_Info_Comments GLEAN3_15991 ###
SigPep-SRCR(2)-TM. See GLEAN3_15989.
###Gene_Info_Comments GLEAN3_16195 ###
SRCR(3). Possibly incomplete.
###Gene_Info_Comments GLEAN3_16373 ###
SRCR(2)-TM. Possibly incomplete. See GLEAN3_16374.
###Gene_Info_Comments GLEAN3_16374 ###
SRCR(9). Possibly incomplete. See GLEAN3_16373.
###Gene_Info_Comments GLEAN3_08979 ###
This gene model predicts a protein with two separate domains that are normally on different genes: a p53 DNA binding domain at its N-terminus, and the SPOC domain (from the spen transcriptional regulator) at its C-terminus.  There is a large gap in the sequence of the scaffold that lies between these domains however, which leads me to suspect that these are actually two separate genes that have been artificially fused in the computational predictions.  If this is the case then the C-terminal gene with the SPOC domain would be a homologue of Drosophila transcriptional regulator split ends (spen), and should be named "Sp-Spen".
###Gene_Info_Comments GLEAN3_16531 ###
SRCR(2). Possibly incomplete.
###Gene_Info_Comments GLEAN3_16880 ###
SRCR(4). Possibly incomplete.
###Gene_Info_Comments GLEAN3_28905 ###
PREDICTED: similar to Prostaglandin E2 receptor, EP4 subtype 
(Prostanoid EP4 receptor) (PGE receptor, EP4 subtype)
###Gene_Info_Comments GLEAN3_15849 ###
poor homology (BLAST score = 5e-05)
###Gene_Info_Comments GLEAN3_10562 ###
NOT Embryonically Expressed, may be pseudo-gene or adult-only gene
###Gene_Info_Comments GLEAN3_03857 ###
PREDICTED: Strongylocentrotus purpuratus similar to thyroid hormone receptor interactor 12 (LOC578329), mRNA.

###Gene_Info_Comments GLEAN3_25176 ###
Significant overlap with Glean3_11072, which encodes the C-terminal portion of ATM.  The last 7 exons of this model are from a completely different gene (an oxidoreductase), and were artifactually fused to this gene as a result of incomplete sequence on the scaffold between them.
###Gene_Info_Comments GLEAN3_14345 ###
IDENTICAL to 01459 except at very 3' end
###Gene_Info_Comments GLEAN3_01459 ###
Identical to 14345 except at very 3' end.  
NOTE no tiling data or EST.  May be pseudogene or expressed in adult only.
###Gene_Info_Comments GLEAN3_26783 ###
This model encodes the majority of the ATR protein, but is missing the C-terminus, which is encoded in Glean3_11017.  The N-terminus (~200 amino acids) predicted by this model is quite different from vertebrate ATR.
###Gene_Info_Comments GLEAN3_11017 ###
This model appears to encode the C-terminus of the ATR protein.  There is significant overlap with Glean3_26783, which encodes the N-terminal portion (minus the N-terminus).
###Gene_Info_Comments GLEAN3_09522 ###
GLEAN3_23882 and GLEAN3_03142 are PTEN hits
###Gene_Info_Comments GLEAN3_24659 ###
Partial sequence.
###Gene_Info_Comments GLEAN3_22223 ###
Contains C-lectin and PAN- Apple domains. Aligns to carboxy end of human versican-like protein.
###Gene_Info_Comments GLEAN3_26772 ###
Partial sequence.
###Gene_Info_Comments GLEAN3_14959 ###
GLEAN3_26257 is a high score hit with less coverage of the mouse sequence used for Blast Query.
###Gene_Info_Comments GLEAN3_19405 ###
Base pairs 528-885 of this glean sequence are PNKP-like.
###Gene_Info_Comments GLEAN3_10324 ###
GLEAN3_21010 is a significant hit limited to N-terminal coverage of the human xrcc1 sequence as Query.
###Gene_Info_Comments GLEAN3_20872 ###
Partial sequence.
###Gene_Info_Comments GLEAN3_03158 ###
Similar to Slingshot 1 and 2.
###Gene_Info_Comments GLEAN3_00287 ###
N-term of protein similar to human vPARP. C-term has no similarity
###Gene_Info_Comments GLEAN3_00991 ###
N-term of gene
###Gene_Info_Comments GLEAN3_00992 ###
C-term of gene (see GLEAN3_00991)
###Gene_Info_Comments GLEAN3_02082 ###
similar to C elegans unnamed gene
###Gene_Info_Comments GLEAN3_09224 ###
Could be haplotype pair of GLEAN3_06396
###Gene_Info_Comments GLEAN3_14509 ###
Related to At1g79380/T8K14_20
###Gene_Info_Comments GLEAN3_22638 ###
Possible haplotype pair of GLEAN3_13171
###Gene_Info_Comments GLEAN3_26339 ###
N-terminal portion of this gene might be GLEAN3_18354
###Gene_Info_Comments GLEAN3_17320 ###
Possibly the N-term of GLEAN3_17321
###Gene_Info_Comments GLEAN3_17321 ###
Possibly the C-term of GLEAN3_17320
###Gene_Info_Comments GLEAN3_22501 ###
Partial sequence.
###Gene_Info_Comments GLEAN3_19984 ###
partial sequence. Also see GLEAN3_00776.
###Gene_Info_Comments GLEAN3_04377 ###
This GLEAN represents the sea urchin Outer Arm Dynein Light Chain 4, as defined by the Anthocidaris crassispina cDNA (gi|2754612|dbj|BAA24152.1| outer arm dynein light chain 4 [Anthocidaris crassispina]) and the RefSeq gi|72093505|ref|XP_794465.1| PREDICTED: similar to dynein, axonemal, light chain 4 [Strongylocentrotus purpuratus]. 
###Gene_Info_Comments GLEAN3_18404 ###
looks identical/very similar to Glean3_16657
###Gene_Info_Comments GLEAN3_11682 ###
The predicted amino-terminal 50 amino acids of GLEAN3_11682 do not agree with the comparable regions of other Dynein Light Chain-2 sequences and are almost certainly incorrect.  GLEAN3_11682 is essentially identical to GLEAN3_11683, and identical for an extended length with GLEAN3_11681 and GLEAN3_11684.
Because there is a good chance that they do not represent four distinct gene products, they were named as follows:
GLEAN3_11682: Sp-Dynein Light Chain-2-4a
GLEAN3_11681: Sp-Dynein Light Chain-2-4b
GLEAN3_11683: Sp-Dynein Light Chain-2-4c
GLEAN3_11684: Sp-Dynein Light Chain-2-4d.
###Gene_Info_Comments GLEAN3_24497 ###
GLEAN3_24497 is essentially identical to GLEAN3_24498, GLEAN3_24499, and GLEAN3_24500.  Because there is a good chance that they do not represent four distinct gene products, they were named as follows:
GLEAN3_24497: Sp-Dynein Light Chain-2-5a
GLEAN3_24498: Sp-Dynein Light Chain-2-5b
GLEAN3_24499: Sp-Dynein Light Chain-2-5c
GLEAN3_24500: Sp-Dynein Light Chain-2-5d.
###Gene_Info_Comments GLEAN3_24498 ###
GLEAN3_24498 is essentially identical to GLEAN3_24497, GLEAN3_24499, and GLEAN3_24500.  Because there is a good chance that they do not represent four distinct gene products, they were named as follows:
GLEAN3_24497: Sp-Dynein Light Chain-2-5a
GLEAN3_24498: Sp-Dynein Light Chain-2-5b
GLEAN3_24499: Sp-Dynein Light Chain-2-5c
GLEAN3_24500: Sp-Dynein Light Chain-2-5d.


###Gene_Info_Comments GLEAN3_24499 ###
GLEAN3_24499 is essentially identical to GLEAN3_24497, GLEAN3_24498, and GLEAN3_24500.  Because there is a good chance that they do not represent four distinct gene products, they were named as follows:
GLEAN3_24497: Sp-Dynein Light Chain-2-5a
GLEAN3_24498: Sp-Dynein Light Chain-2-5b
GLEAN3_24499: Sp-Dynein Light Chain-2-5c
GLEAN3_24500: Sp-Dynein Light Chain-2-5d.
###Gene_Info_Comments GLEAN3_24500 ###
GLEAN3_24500 is essentially identical to GLEAN3_24497, GLEAN3_24498, and GLEAN3_24499.  Because there is a good chance that they do not represent four distinct gene products, they were named as follows:
GLEAN3_24497: Sp-Dynein Light Chain-2-5a
GLEAN3_24498: Sp-Dynein Light Chain-2-5b
GLEAN3_24499: Sp-Dynein Light Chain-2-5c
GLEAN3_24500: Sp-Dynein Light Chain-2-5d.
###Gene_Info_Comments GLEAN3_11684 ###
Most of the predicted amino acid sequence of GLEAN3_11684 does not agree with the comparable regions of other Dynein Light Chain-2 sequences and is almost certainly incorrect.  GLEAN3_11684 is very similar to the neighboring gene models GLEAN3_11681, GLEAN3_11682, and GLEAN3_11683.
Because there is a good chance that they do not represent four distinct gene products, they were named as follows:
GLEAN3_11682: Sp-Dynein Light Chain-2-4a
GLEAN3_11681: Sp-Dynein Light Chain-2-4b
GLEAN3_11683: Sp-Dynein Light Chain-2-4c
GLEAN3_11684: Sp-Dynein Light Chain-2-4d.
###Gene_Info_Comments GLEAN3_11681 ###
The predicted amino-terminal 150 amino acids of GLEAN3_11681 do not agree with the comparable regions of other Dynein Light Chain-2 sequences and may be incorrect.  However, GLEAN3_11681 is identical with RefSeq XP_795373.1. GLEAN3_11681 is essentially identical for an extended length with GLEAN3_11682, GLEAN3_11683, and GLEAN3_11684.
Because there is a good chance that they do not represent four distinct gene products, they were named as follows:
GLEAN3_11682: Sp-Dynein Light Chain-2-4a
GLEAN3_11681: Sp-Dynein Light Chain-2-4b
GLEAN3_11683: Sp-Dynein Light Chain-2-4c
GLEAN3_11684: Sp-Dynein Light Chain-2-4d.
###Gene_Info_Comments GLEAN3_11683 ###
The predicted amino-terminal 50 amino acids of GLEAN3_11683 do not agree with the comparable regions of other Dynein Light Chain-2 sequences and are almost certainly incorrect.  GLEAN3_11683 is essentially identical to GLEAN3_11682, and identical for an extended length with GLEAN3_11681 and GLEAN3_11684.
Because there is a good chance that they do not represent four distinct gene products, they were named as follows:
GLEAN3_11682: Sp-Dynein Light Chain-2-4a
GLEAN3_11681: Sp-Dynein Light Chain-2-4b
GLEAN3_11683: Sp-Dynein Light Chain-2-4c
GLEAN3_11684: Sp-Dynein Light Chain-2-4d.
###Gene_Info_Comments GLEAN3_08800 ###
The predicted amino acid sequence of GLEAN3_08800 is identical to those of GLEAN3_08799 and GLEAN3_08801.
Because there is a good chance that they do not represent three distinct gene products, they were named as follows:
GLEAN3_08799: Sp-Dynein Light Chain-2-3a
GLEAN3_08800: Sp-Dynein Light Chain-2-3b
GLEAN3_08801: Sp-Dynein Light Chain-2-3c
###Gene_Info_Comments GLEAN3_10140 ###
very weak signal.  no est.  may be pseudogene
###Gene_Info_Comments GLEAN3_11875 ###
Partial duplicate prediction for GLEAN3_11837
###Gene_Info_Comments GLEAN3_17987 ###
Contains MMR_HSR1 domain (GTPase of unknown function domain)
###Gene_Info_Comments GLEAN3_11970 ###
IDENTICAL TO 12085
###Gene_Info_Comments GLEAN3_11134 ###
Duplicate partial prediction for GLEAN3_01605
###Gene_Info_Comments GLEAN3_10689 ###
Incomplete prediction? GLEAN3_09456 is almost complete duplicate prediction for GLEAN3_10689.
###Gene_Info_Comments GLEAN3_28471 ###
No signal in tiling array or EST.  May be pseudogene or adult only expression.
###Gene_Info_Comments GLEAN3_21593 ###
This model appears to encode the middle part of the ATM protein.  The N- and C-terminal parts are encoded by Glean3_05652 and Glean3_11072, respectively.  There is sequence overlap between all three models.
###Gene_Info_Comments GLEAN3_11072 ###
This model encodes the C-terminus of the ATM protein.  The other parts are encoded by Glean3_05652 (N-terminus) and Glean3_21593 (middle).  There is sequence overlap between all three models.  This model also has significant sequence overlap with Glean3_25176.
###Gene_Info_Comments GLEAN3_05652 ###
This model appears to encode the N-terminal portion of ATM.  The rest of the protein falls on Glean3_11072 (C-terminus) and Glean3_21593 (middle).  There is sequence overlap between all three models.
###Gene_Info_Comments GLEAN3_27613 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 17, subfamily A, polypeptide 1 [Danio rerio]" (NP_997971.1) is 25.41% over 488 BLAST alignment positions. 383 of 817 Muscle alignment positions masked (46.800 %; 434 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_27796 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 3, subfamily a, polypeptide 13 [Rattus norvegicus]" (NP_671739.1) is 40.64% over 342 BLAST alignment positions. 178 of 677 Muscle alignment positions masked (26.200 %; 499 positions used for tree generation) with a Muscle scorefile cutoff of 25. GLEAN3_27795 may be N terminus. Not very CYP3 like.   COMMENTS from arm@stowers-institute.org:  partial, missing N-terminus
###Gene_Info_Comments GLEAN3_28152 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 4, subfamily v [Gallus gallus]" (NP_001001879.1) is 49.20% over 502 BLAST alignment positions. 531 of 956 Muscle alignment positions masked (55.500 %; 425 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_28699 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 38.24% over 421 BLAST alignment positions. 271 of 732 Muscle alignment positions masked (37.000 %; 461 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  partial, missing N-terminus
###Gene_Info_Comments GLEAN3_28922 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " Cyp2f2 protein [Xenopus tropicalis]" (NP_001010999.1) is 31.80% over 217 BLAST alignment positions. 138 of 589 Muscle alignment positions masked (23.400 %; 451 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  partial, missing C-terminus
###Gene_Info_Comments GLEAN3_28934 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily J, polypeptide 2, B [Danio rerio]" (NP_956914.1) is 38.64% over 471 BLAST alignment positions. 238 of 714 Muscle alignment positions masked (33.300 %; 476 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_06272 ###
incomplete sequence
###Gene_Info_Comments GLEAN3_16605 ###
Identical to 11157 (11157 is missing aa 1-36)
###Gene_Info_Comments GLEAN3_11157 ###
IDENTICAL TO 16605 (EXCEPT 11157 is missing aa 1-36 of 16605)
###Gene_Info_Comments GLEAN3_01094 ###
PARTIAL, MISSING N-TERMINUS
###Gene_Info_Comments GLEAN3_12205 ###
Partial gene, missing the N-terminal half.
###Gene_Info_Comments GLEAN3_08978 ###
This Glean3 model probably encodes the N-terminus of the p53 homologue predicted by the 5' end of Glean3_08979, and should be fused therewith.  The third exon is probably not real, as there is no evidence for expression in the tiling data.  At the same time, Glean3_08979 probably artifactually fuses two different genes, and needs to be broken up (see annotation to that gene model).
###Gene_Info_Comments GLEAN3_15880 ###
IDENTICAL TO 08410
###Gene_Info_Comments GLEAN3_13863 ###
IDENTICAL TO 18282
###Gene_Info_Comments GLEAN3_04692 ###
Partial gene.  Missing N-terminus.
###Gene_Info_Comments GLEAN3_12404 ###
Prediction is incomplete.
###Gene_Info_Comments GLEAN3_13183 ###
Genscan predicted as sea squirt Halocynthia roretzi troponin 1
###Gene_Info_Comments GLEAN3_24320 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 46.11% over 193 BLAST alignment positions. 445 of 886 Muscle alignment positions masked (50.200 %; 441 positions used for tree generation) with a Muscle scorefile cutoff of 25. partial   COMMENTS from arm@stowers-institute.org:  missing stretch in middle
###Gene_Info_Comments GLEAN3_25299 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " hypothetical protein LOC393108 [Danio rerio]" (NP_956433.1) is 41.15% over 384 BLAST alignment positions. 225 of 696 Muscle alignment positions masked (32.300 %; 471 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  missing N-terminus
###Gene_Info_Comments GLEAN3_25595 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 51 [Strongylocentrotus purpuratus]" (NP_001001906.1) is 99.66% over 297 BLAST alignment positions. 436 of 902 Muscle alignment positions masked (48.300 %; 466 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  missing N&C-terminus
###Gene_Info_Comments GLEAN3_25829 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 46, subfamily a, polypeptide 1 [Mus musculus]" (NP_034140.1) is 40.65% over 492 BLAST alignment positions. 239 of 633 Muscle alignment positions masked (37.700 %; 394 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_25830 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " hypothetical protein LOC613109 [Xenopus tropicalis]" (NP_001027517.1) is 39.67% over 489 BLAST alignment positions. 241 of 639 Muscle alignment positions masked (37.700 %; 398 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  missing some N-terminus residues
###Gene_Info_Comments GLEAN3_25863 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " hypothetical protein LOC393108 [Danio rerio]" (NP_956433.1) is 38.95% over 439 BLAST alignment positions. 235 of 699 Muscle alignment positions masked (33.600 %; 464 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_25956 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 33.50% over 397 BLAST alignment positions. 670 of 1128 Muscle alignment positions masked (59.300 %; 458 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  missing N-terminus
###Gene_Info_Comments GLEAN3_25969 ###
Missing N terminus due to scaffold truncation.
Single exon!
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 1, subfamily A, polypeptide 1 [Gallus gallus]" (NP_990477.1) is 36.00% over 300 BLAST alignment positions. 184 of 658 Muscle alignment positions masked (27.900 %; 474 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  missing N-terminus
###Gene_Info_Comments GLEAN3_26188 ###
Incomplete - runs off end of scaffold, missing last 2 exons
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " thromboxane A synthase 1 [Danio rerio]" (NP_991172.1) is 27.33% over 311 BLAST alignment positions. 89 of 573 Muscle alignment positions masked (15.500 %; 484 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_26360 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily j, polypeptide 9 [Rattus norvegicus]" (NP_786942.1) is 40.98% over 388 BLAST alignment positions. 254 of 733 Muscle alignment positions masked (34.600 %; 479 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  missing N-terminus
###Gene_Info_Comments GLEAN3_26373 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily u, polypeptide 1 [Rattus norvegicus]" (NP_001019950.1) is 35.43% over 446 BLAST alignment positions. 358 of 810 Muscle alignment positions masked (44.100 %; 452 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_26477 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " hypothetical protein LOC572000 [Danio rerio]" (NP_001020728.1) is 33.95% over 433 BLAST alignment positions. 269 of 726 Muscle alignment positions masked (37.000 %; 457 positions used for tree generation) with a Muscle scorefile cutoff of 25.   
Exon 2 duplicated as exon 3: misassembly problem
COMMENTS from arm@stowers-institute.org:  extra N-terminus
###Gene_Info_Comments GLEAN3_27153 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " hypothetical protein LOC393204 [Danio rerio]" (NP_956529.1) is 30.21% over 480 BLAST alignment positions. 1574 of 1945 Muscle alignment positions masked (80.900 %; 371 positions used for tree generation) with a Muscle scorefile cutoff of 25. GLEAN3_27152 may be N terminus   COMMENTS from arm@stowers-institute.org:  missing central stretch and C-terminus
###Gene_Info_Comments GLEAN3_20876 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, subfamily XXVIA, polypeptide 1 [Danio rerio]" (NP_571221.2) is 35.74% over 484 BLAST alignment positions. 246 of 704 Muscle alignment positions masked (34.900 %; 458 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_21087 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " hypothetical protein LOC334097 [Danio rerio]" (NP_997884.1) is 34.84% over 442 BLAST alignment positions. 256 of 679 Muscle alignment positions masked (37.700 %; 423 positions used for tree generation) with a Muscle scorefile cutoff of 25. Early diverging CYP1?
###Gene_Info_Comments GLEAN3_21185 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 17, subfamily A, polypeptide 1 [Danio rerio]" (NP_997971.1) is 32.89% over 377 BLAST alignment positions. 202 of 609 Muscle alignment positions masked (33.100 %; 407 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  missing N- and C-terminus
###Gene_Info_Comments GLEAN3_21251 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, subfamily 3A, polypeptide 62 [Rattus norvegicus]" (NP_001019403.1) is 31.63% over 332 BLAST alignment positions. 83 of 580 Muscle alignment positions masked (14.300 %; 497 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  missing C-terminus and stretch in middle
###Gene_Info_Comments GLEAN3_22109 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 35.99% over 389 BLAST alignment positions. 229 of 671 Muscle alignment positions masked (34.100 %; 442 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  missing N-terminus
###Gene_Info_Comments GLEAN3_22110 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 36.19% over 409 BLAST alignment positions. 211 of 667 Muscle alignment positions masked (31.600 %; 456 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_22432 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " hypothetical protein LOC393108 [Danio rerio]" (NP_956433.1) is 38.48% over 434 BLAST alignment positions. 268 of 728 Muscle alignment positions masked (36.800 %; 460 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_22590 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 37.72% over 501 BLAST alignment positions. 299 of 770 Muscle alignment positions masked (38.800 %; 471 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_22593 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 38.52% over 514 BLAST alignment positions. 392 of 833 Muscle alignment positions masked (47.000 %; 441 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_23067 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily u, polypeptide 1 [Rattus norvegicus]" (NP_001019950.1) is 34.01% over 444 BLAST alignment positions. 334 of 773 Muscle alignment positions masked (43.200 %; 439 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_23068 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily u, polypeptide 1 [Rattus norvegicus]" (NP_001019950.1) is 32.14% over 448 BLAST alignment positions. 242 of 682 Muscle alignment positions masked (35.400 %; 440 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_24078 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " hypothetical protein LOC393108 [Danio rerio]" (NP_956433.1) is 33.23% over 310 BLAST alignment positions. 251 of 718 Muscle alignment positions masked (34.900 %; 467 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_24113 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " MGC107863 protein [Xenopus tropicalis]" (NP_001015719.1) is 41.63% over 430 BLAST alignment positions. 248 of 726 Muscle alignment positions masked (34.100 %; 478 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_18515 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 37.69% over 459 BLAST alignment positions. 212 of 662 Muscle alignment positions masked (32.000 %; 450 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_19082 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 43.40% over 235 BLAST alignment positions. 479 of 889 Muscle alignment positions masked (53.800 %; 410 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  partial, missing N-terminus
###Gene_Info_Comments GLEAN3_19464 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450 A 37 [Gallus gallus]" (NP_001001751.1) is 39.19% over 296 BLAST alignment positions. 41 of 534 Muscle alignment positions masked (7.600 %; 493 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  missing N-terminus
###Gene_Info_Comments GLEAN3_19883 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 1, subfamily B, polypeptide 1 [Danio rerio]" (NP_001013285.2) is 33.33% over 423 BLAST alignment positions. 264 of 693 Muscle alignment positions masked (38.000 %; 429 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_19898 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily J, polypeptide 2 [Homo sapiens]" (NP_000766.2) is 32.91% over 468 BLAST alignment positions. 170 of 636 Muscle alignment positions masked (26.700 %; 466 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_19899 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily J, polypeptide 2, B [Danio rerio]" (NP_956914.1) is 34.40% over 436 BLAST alignment positions. 223 of 693 Muscle alignment positions masked (32.100 %; 470 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_20229 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 4, subfamily v [Gallus gallus]" (NP_001001879.1) is 52.84% over 388 BLAST alignment positions. 738 of 1149 Muscle alignment positions masked (64.200 %; 411 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  missing N-terminus
###Gene_Info_Comments GLEAN3_20233 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily d, polypeptide 22 [Rattus norvegicus]" (NP_612524.1) is 38.37% over 258 BLAST alignment positions. 201 of 633 Muscle alignment positions masked (31.700 %; 432 positions used for tree generation) with a Muscle scorefile cutoff of 25. N terminus is GLEAN3_20232   COMMENTS from arm@stowers-institute.org:  missing N-terminus
###Gene_Info_Comments GLEAN3_20234 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily j, polypeptide 11 [Mus musculus]" (NP_001004141.1) is 33.48% over 466 BLAST alignment positions. 80 of 538 Muscle alignment positions masked (14.800 %; 458 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_20752 ###
Fragment of CYP2. Possible misassembly with GLEAN3_20751

BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 46.67% over 225 BLAST alignment positions. 457 of 896 Muscle alignment positions masked (51.000 %; 439 positions used for tree generation) with a Muscle scorefile cutoff of 25. 
###Gene_Info_Comments GLEAN3_20753 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 37.15% over 463 BLAST alignment positions. 214 of 667 Muscle alignment positions masked (32.000 %; 453 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_20754 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 40.14% over 436 BLAST alignment positions. 183 of 666 Muscle alignment positions masked (27.400 %; 483 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_20756 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 39.10% over 445 BLAST alignment positions. 193 of 642 Muscle alignment positions masked (30.000 %; 449 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_14843 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 1, subfamily A, polypeptide 1 [Gallus gallus]" (NP_990477.1) is 34.70% over 464 BLAST alignment positions. 173 of 659 Muscle alignment positions masked (26.200 %; 486 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_15096 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily j, polypeptide 11 [Mus musculus]" (NP_001004141.1) is 29.45% over 292 BLAST alignment positions. 51 of 521 Muscle alignment positions masked (9.700 %; 470 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_15120 ###
Partial sequence.     BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " hypothetical protein LOC406641 [Danio rerio]" (NP_998497.1) is 39.04% over 146 BLAST alignment positions. 3167 of 3510 Muscle alignment positions masked (90.200 %; 343 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  partial, missing N-terminus
###Gene_Info_Comments GLEAN3_15256 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 36.93% over 501 BLAST alignment positions. 179 of 644 Muscle alignment positions masked (27.700 %; 465 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_15442 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 28.47% over 439 BLAST alignment positions. 351 of 798 Muscle alignment positions masked (43.900 %; 447 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_16251 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " MGC107863 protein [Xenopus tropicalis]" (NP_001015719.1) is 39.87% over 449 BLAST alignment positions. 406 of 858 Muscle alignment positions masked (47.300 %; 452 positions used for tree generation) with a Muscle scorefile cutoff of 25. GLEAN3_16152 may be C terminus   COMMENTS from arm@stowers-institute.org:  missing N-terminus residues
###Gene_Info_Comments GLEAN3_16442 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily u, polypeptide 1 [Rattus norvegicus]" (NP_001019950.1) is 36.18% over 398 BLAST alignment positions. 353 of 803 Muscle alignment positions masked (43.900 %; 450 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  missing C-terminus
###Gene_Info_Comments GLEAN3_16816 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily u, polypeptide 1 [Rattus norvegicus]" (NP_001019950.1) is 38.73% over 377 BLAST alignment positions. 194 of 646 Muscle alignment positions masked (30.000 %; 452 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_17582 ###
Missing middle exon (part of the I helix) due to missing contig data on scaffold.
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 1, subfamily B, polypeptide 1 [Danio rerio]" (NP_001013285.2) is 27.59% over 464 BLAST alignment positions. 1535 of 1933 Muscle alignment positions masked (79.400 %; 398 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  missing C-terminus
###Gene_Info_Comments GLEAN3_17986 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 36.96% over 514 BLAST alignment positions. 328 of 785 Muscle alignment positions masked (41.700 %; 457 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_18242 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " hypothetical protein LOC393433 [Danio rerio]" (NP_956755.1) is 39.11% over 427 BLAST alignment positions. 308 of 696 Muscle alignment positions masked (44.200 %; 388 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_18372 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 37.80% over 508 BLAST alignment positions. 398 of 845 Muscle alignment positions masked (47.100 %; 447 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_18468 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " hypothetical protein LOC393108 [Danio rerio]" (NP_956433.1) is 35.41% over 466 BLAST alignment positions. 35 of 514 Muscle alignment positions masked (6.800 %; 479 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_00279 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 17, subfamily A, polypeptide 1 [Danio rerio]" (NP_997971.1) is 44.06% over 202 BLAST alignment positions. 1620 of 2057 Muscle alignment positions masked (78.700 %; 437 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  partial, missing N-terminus
###Gene_Info_Comments GLEAN3_24836 ###
Possible non-coding exon included in GLEAN annotation
###Gene_Info_Comments GLEAN3_14896 ###
No signal in tiling array or EST data. May be a pseudogene or expressed only in adults.
###Gene_Info_Comments GLEAN3_00645 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 27, subfamily B, polypeptide 1 [Xenopus tropicalis]" (NP_001006907.1) is 39.75% over 483 BLAST alignment positions. 494 of 938 Muscle alignment positions masked (52.600 %; 444 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_00746 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 35.25% over 278 BLAST alignment positions. 372 of 798 Muscle alignment positions masked (46.600 %; 426 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_01622 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, subfamily IID (debrisoquine, sparteine, etc., -metabolizing), polypeptide 6 [Bos taurus]" (NP_776954.1) is 24.95% over 477 BLAST alignment positions. 65 of 554 Muscle alignment positions masked (11.700 %; 489 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_01792 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " MGC107863 protein [Xenopus tropicalis]" (NP_001015719.1) is 39.95% over 398 BLAST alignment positions. 262 of 726 Muscle alignment positions masked (36.000 %; 464 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  missing some N-terminus residues
###Gene_Info_Comments GLEAN3_02371 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily j, polypeptide 6 [Mus musculus]" (NP_034138.2) is 30.14% over 554 BLAST alignment positions. 416 of 879 Muscle alignment positions masked (47.300 %; 463 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_02380 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 4, subfamily v [Gallus gallus]" (NP_001001879.1) is 53.59% over 362 BLAST alignment positions. 753 of 1162 Muscle alignment positions masked (64.800 %; 409 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  partial, missing N-terminus
###Gene_Info_Comments GLEAN3_02590 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " hypothetical protein LOC613109 [Xenopus tropicalis]" (NP_001027517.1) is 43.27% over 342 BLAST alignment positions. 576 of 968 Muscle alignment positions masked (59.500 %; 392 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  partial, missing N-terminus
###Gene_Info_Comments GLEAN3_02656 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " MGC107863 protein [Xenopus tropicalis]" (NP_001015719.1) is 38.43% over 458 BLAST alignment positions. 202 of 653 Muscle alignment positions masked (30.900 %; 451 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_02658 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " CYP2A13 protein [Xenopus tropicalis]" (NP_001010998.1) is 34.52% over 478 BLAST alignment positions. 117 of 569 Muscle alignment positions masked (20.500 %; 452 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_02660 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " hypothetical protein LOC613109 [Xenopus tropicalis]" (NP_001027517.1) is 42.62% over 298 BLAST alignment positions. 533 of 935 Muscle alignment positions masked (57.000 %; 402 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_02831 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily b, polypeptide 19 [Mus musculus]" (NP_031840.1) is 35.96% over 178 BLAST alignment positions. 84 of 548 Muscle alignment positions masked (15.300 %; 464 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  partial
###Gene_Info_Comments GLEAN3_02832 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 4, subfamily v [Gallus gallus]" (NP_001001879.1) is 54.14% over 423 BLAST alignment positions. 512 of 927 Muscle alignment positions masked (55.200 %; 415 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  partial, missing N-terminus
###Gene_Info_Comments GLEAN3_02884 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 26, subfamily b, polypeptide 1 [Danio rerio]" (NP_997831.1) is 37.76% over 241 BLAST alignment positions. 267 of 709 Muscle alignment positions masked (37.600 %; 442 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  partial, missing N-terminus
###Gene_Info_Comments GLEAN3_02898 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily d, polypeptide 9 [Rattus norvegicus]" (NP_695225.1) is 25.47% over 475 BLAST alignment positions. 287 of 759 Muscle alignment positions masked (37.800 %; 472 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_03038 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 1, subfamily A, polypeptide 1 [Gallus gallus]" (NP_990477.1) is 36.04% over 455 BLAST alignment positions. 949 of 1435 Muscle alignment positions masked (66.100 %; 486 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_03231 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450 1A1 [Sus scrofa]" (NP_999577.1) is 33.88% over 487 BLAST alignment positions. 163 of 595 Muscle alignment positions masked (27.300 %; 432 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_03232 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 1, subfamily a, polypeptide 1 [Rattus norvegicus]" (NP_036672.2) is 34.65% over 456 BLAST alignment positions. 183 of 617 Muscle alignment positions masked (29.600 %; 434 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_03606 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily J, polypeptide 2, B [Danio rerio]" (NP_956914.1) is 38.82% over 492 BLAST alignment positions. 281 of 755 Muscle alignment positions masked (37.200 %; 474 positions used for tree generation) with a Muscle scorefile cutoff of 25.   Original GLEAN3 prediction interrupted by non-LTR retrotransposon (GLEAN3_03604) and AP1 (GLEAN3_03603), possibly also from phage event.
COMMENTS from arm@stowers-institute.org:  extra N-terminus in original GLEAN model
###Gene_Info_Comments GLEAN3_03607 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily j, polypeptide 9 [Rattus norvegicus]" (NP_786942.1) is 40.05% over 382 BLAST alignment positions. 269 of 736 Muscle alignment positions masked (36.500 %; 467 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  partial, missing N-terminus
###Gene_Info_Comments GLEAN3_03687 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " hypothetical protein LOC613109 [Xenopus tropicalis]" (NP_001027517.1) is 33.94% over 330 BLAST alignment positions. 317 of 697 Muscle alignment positions masked (45.400 %; 380 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  partial
###Gene_Info_Comments GLEAN3_05160 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " hypothetical protein LOC572000 [Danio rerio]" (NP_001020728.1) is 33.69% over 466 BLAST alignment positions. 200 of 651 Muscle alignment positions masked (30.700 %; 451 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_05439 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " hypothetical protein LOC613109 [Xenopus tropicalis]" (NP_001027517.1) is 42.06% over 504 BLAST alignment positions. 244 of 646 Muscle alignment positions masked (37.700 %; 402 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_05655 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily J, polypeptide 2, B [Danio rerio]" (NP_956914.1) is 31.26% over 531 BLAST alignment positions. 192 of 658 Muscle alignment positions masked (29.100 %; 466 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_05668 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 34.34% over 530 BLAST alignment positions. 230 of 693 Muscle alignment positions masked (33.100 %; 463 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_05931 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450 4F6 [Rattus norvegicus]" (NP_695230.1) is 36.55% over 435 BLAST alignment positions. 681 of 1195 Muscle alignment positions masked (56.900 %; 514 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  extra C-terminus
###Gene_Info_Comments GLEAN3_06574 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " MGC107863 protein [Xenopus tropicalis]" (NP_001015719.1) is 39.67% over 484 BLAST alignment positions. 308 of 772 Muscle alignment positions masked (39.800 %; 464 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_07306 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 37.73% over 440 BLAST alignment positions. 182 of 638 Muscle alignment positions masked (28.500 %; 456 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  missing N-terminus
###Gene_Info_Comments GLEAN3_07335 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450 CYP4F18 [Mus musculus]" (NP_077764.1) is 45.73% over 199 BLAST alignment positions. 658 of 1166 Muscle alignment positions masked (56.400 %; 508 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  partial, missing N-terminus
###Gene_Info_Comments GLEAN3_07409 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 17, subfamily A, polypeptide 1 [Danio rerio]" (NP_997971.1) is 35.33% over 450 BLAST alignment positions. 127 of 571 Muscle alignment positions masked (22.200 %; 444 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_07558 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450 4F6 [Rattus norvegicus]" (NP_695230.1) is 30.44% over 450 BLAST alignment positions. 579 of 1091 Muscle alignment positions masked (53.000 %; 512 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  partial, missing C-terminus
###Gene_Info_Comments GLEAN3_08152 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " MGC107863 protein [Xenopus tropicalis]" (NP_001015719.1) is 40.68% over 440 BLAST alignment positions. 273 of 739 Muscle alignment positions masked (36.900 %; 466 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_08632 ###
Most CYP2-like urchin sequence. BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 41.78% over 450 BLAST alignment positions. 403 of 859 Muscle alignment positions masked (46.900 %; 456 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_09118 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450 2AD3 [Danio rerio]" (NP_001020725.1) is 45.61% over 171 BLAST alignment positions. 379 of 822 Muscle alignment positions masked (46.100 %; 443 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_09512 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " MGC107863 protein [Xenopus tropicalis]" (NP_001015719.1) is 42.69% over 431 BLAST alignment positions. 210 of 690 Muscle alignment positions masked (30.400 %; 480 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  missing N-terminus
###Gene_Info_Comments GLEAN3_09692 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " MGC107863 protein [Xenopus tropicalis]" (NP_001015719.1) is 40.48% over 294 BLAST alignment positions. 221 of 692 Muscle alignment positions masked (31.900 %; 471 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  partial, missing C-terminus
###Gene_Info_Comments GLEAN3_09825 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 38.85% over 296 BLAST alignment positions. 302 of 737 Muscle alignment positions masked (40.900 %; 435 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  partial, missing N-terminus
###Gene_Info_Comments GLEAN3_10143 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 38.30% over 376 BLAST alignment positions. 264 of 730 Muscle alignment positions masked (36.100 %; 466 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  missing N-terminus
###Gene_Info_Comments GLEAN3_10246 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily U, polypeptide 1 [Homo sapiens]" (NP_898898.1) is 38.19% over 199 BLAST alignment positions. 187 of 653 Muscle alignment positions masked (28.600 %; 466 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  partial, missing N-terminus
###Gene_Info_Comments GLEAN3_10563 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily J, polypeptide 2 [Homo sapiens]" (NP_000766.2) is 36.23% over 494 BLAST alignment positions. 238 of 714 Muscle alignment positions masked (33.300 %; 476 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_10576 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily J, polypeptide 2, B [Danio rerio]" (NP_956914.1) is 40.34% over 471 BLAST alignment positions. 76 of 553 Muscle alignment positions masked (13.700 %; 477 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_12080 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 4, subfamily v, polypeptide 2 [Homo sapiens]" (NP_997235.2) is 50.75% over 469 BLAST alignment positions. 541 of 957 Muscle alignment positions masked (56.500 %; 416 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_12081 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " hypothetical protein LOC510156 [Bos taurus]" (NP_001029545.1) is 51.39% over 469 BLAST alignment positions. 520 of 940 Muscle alignment positions masked (55.300 %; 420 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_12148 ###
Assembly error: see GLEAN3_25595. Percent ID to " cytochrome P450, family 51 [Strongylocentrotus purpuratus]" (NP_001001906.1) is 98.59% over 142 BLAST alignment positions.
###Gene_Info_Comments GLEAN3_12500 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily J, polypeptide 2, B [Danio rerio]" (NP_956914.1) is 38.26% over 311 BLAST alignment positions. 128 of 591 Muscle alignment positions masked (21.600 %; 463 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  partial, missing N-terminus, missing stretch in middle
###Gene_Info_Comments GLEAN3_12926 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " hypothetical protein LOC553543 [Danio rerio]" (NP_001018358.1) is 33.33% over 312 BLAST alignment positions. 116 of 584 Muscle alignment positions masked (19.800 %; 468 positions used for tree generation) with a Muscle scorefile cutoff of 25.   COMMENTS from arm@stowers-institute.org:  partial, missing C-terminus
###Gene_Info_Comments GLEAN3_13039 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " MGC107863 protein [Xenopus tropicalis]" (NP_001015719.1) is 41.07% over 431 BLAST alignment positions. 233 of 700 Muscle alignment positions masked (33.200 %; 467 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_13199 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " cytochrome P450, family 2, subfamily J, polypeptide 4 [Rattus norvegicus]" (NP_075414.2) is 27.34% over 523 BLAST alignment positions. 307 of 771 Muscle alignment positions masked (39.800 %; 464 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_14092 ###
BLAST identities confined to RefSeq only. Name assigned by phylogenetic methods (ME tree with ML branch lengths or heuristic ML). Percent ID to " thromboxane A synthase 1 [Danio rerio]" (NP_991172.1) is 36.29% over 463 BLAST alignment positions. 80 of 574 Muscle alignment positions masked (13.900 %; 494 positions used for tree generation) with a Muscle scorefile cutoff of 25.
###Gene_Info_Comments GLEAN3_08801 ###
The predicted amino acid sequence of GLEAN3_08801 is identical to those of GLEAN3_08799 and GLEAN3_08800.
Because there is a good chance that they do not represent three distinct gene products, they were named as follows:
GLEAN3_08799: Sp-Dynein Light Chain-2-3a
GLEAN3_08800: Sp-Dynein Light Chain-2-3b
GLEAN3_08801: Sp-Dynein Light Chain-2-3c
###Gene_Info_Comments GLEAN3_27937 ###
The predicted amino-terminal 43 amino acids of GLEAN3_27937 are inconsistent with the comparable regions of other Dynein Light Chain Type 2 sequences, and are almost certainly incorrect; otherwise, this GLEAN matches the neighboring GLEAN3_27938 very well.
Because these may not represent independent gene products, they were named as follows:
GLEAN3_27937: Sp-Dynein Light Chain-2-6a
GLEAN3_27938: Sp-Dynein Light Chain-2-6b
###Gene_Info_Comments GLEAN3_27938 ###
The predicted amino-terminal 50 amino acids of GLEAN3_27938 are inconsistent with the comparable regions of other Dynein Light Chain Type 2 sequences, and are almost certainly incorrect; otherwise, this GLEAN matches the neighboring GLEAN3_27937 very well.
Because these may not represent independent gene products, they were named as follows:
GLEAN3_27937: Sp-Dynein Light Chain-2-6a
GLEAN3_27938: Sp-Dynein Light Chain-2-6b
###Gene_Info_Comments GLEAN3_26507 ###
five EGF - GPS - one TM segment
Apart from the absence of a complete 7tm_2 domain, looks like a member of the LNB-7TM-GPCR subfamily
###Gene_Info_Comments GLEAN3_18579 ###
LDLa - 4-5 EGF - Igc2 - GPS x2 - NO TM predicted

Apart from lack of a 7tm_2 domain, looks like a member of the LNB-TM7-GPCR subfamily
C-terminal end probably misassembled
###Gene_Info_Comments GLEAN3_13084 ###
SR - FA58C X3 - GPS - 7tm_2
matches well with LNB-7TM-GPCR subfamily

NOVEL ARCHITECTURE
SIMILAR DOMAIN COMPOSITION TO GLEAN3_20525
###Gene_Info_Comments GLEAN3_01671 ###
SR x4 - EGF - GPS - three TM segments
Apart from lack of a complete 7tm_2 domain, looks like a member of the LNB-7TM-GPCR subfamily

Novel architecture

###Gene_Info_Comments GLEAN3_01056 ###
Gal-lectin/GPS/7tm_2

a bit like latrophilin but lacks OLF domain
###Gene_Info_Comments GLEAN3_03929 ###
Gal-lectin/HormR/GPS/7tm_2

a bit like latrophilin but lacks OLF domain
###Gene_Info_Comments GLEAN3_20012 ###
LamNT/EGF_Lam-8/FN3 x3

likely N-terminus of usherin2A - mammalian forms have N-terminal LamGL domain in addition

C-terminus encoded in GLEAN3_17733

Previously vertebrate-restricted
###Gene_Info_Comments GLEAN3_17733 ###
Large ECM or surface membrane protein with LamG and lots of FN3 repeats - homologous with the Usherin2A protein of vertebrates.
Missing N-terminal part - LamNT and EGF-Lam domains - probably encoded in GLEAN3_20012

Previously vertebrate-restricted
###Gene_Info_Comments GLEAN3_04543 ###
Using HsSERPINE2 in a BLASTP search of the GLEAN3 models yielded 15 genes. Each was analyzed by SMART and the top 12 hits were found to have a SERPIN domain and were annotated Sp-serpin-like 3 through 12 (2 were already annotated by A.Cameron). BLASTP searches of the GLEAN3 models with other vertebrate Serpins yielded the same top 12 hits.
This predicted protein has 2 serpin domains. This GLEAN model could be a fusion of two genes. The Fgenesh predictions show two separate genes. There are very few serpins with 2 serpin domains. Some were found in Ciona, rat and chicken.
###Gene_Info_Comments GLEAN3_05904 ###
SR/SO/7TM_2

Novel architecture
###Gene_Info_Comments GLEAN3_14551 ###
SO - 7tm_2

Novel architecture
###Gene_Info_Comments GLEAN3_13142 ###
four SO repeats
###Gene_Info_Comments GLEAN3_16249 ###
two SO repeats
###Gene_Info_Comments GLEAN3_18658 ###
three SO repeats
###Gene_Info_Comments GLEAN3_27632 ###
two SO repeats
###Gene_Info_Comments GLEAN3_02711 ###
The gene prediction could be incomplete. It is at the end of a short scaffold.
Using HsSERPINE2 in a BLASTP search of the GLEAN3 models yielded 15 genes. Each was analyzed by SMART and the top 12 hits were found to have a SERPIN domain and were annotated Sp-serpin-like 3 through 12 (2 were already annotated by A.Cameron). BLASTP searches of the GLEAN3 models with other vertebrate Serpins yielded the same top 12 hits.
###Gene_Info_Comments GLEAN3_07225 ###
homolog of Tubulointerstitial nephritis antigen - 
SO-cysteine-type endopeptidase - peptidase domain scores as inactive as in human homolog

Widely distributed phylogenetically
###Gene_Info_Comments GLEAN3_13734 ###
FA58C/CLECT/EGF-Ca/CUB x3
###Gene_Info_Comments GLEAN3_15863 ###
SEMA/PSI/IPT x2

looks like a plexin - could be a c-met-related RTK
###Gene_Info_Comments GLEAN3_26514 ###
SEMA/PSI/IPT x2

looks like a plexin - could be a c-met-related RTK
###Gene_Info_Comments GLEAN3_19919 ###
SEMA/PSI
could be a semaphorin or plexin or a c-met relative
###Gene_Info_Comments GLEAN3_26431 ###
SEMA/PSI
could be a semaphorin or plexin or a c-met relative
###Gene_Info_Comments GLEAN3_24673 ###
SEMA/PSI
could be a semaphorin or plexin or a c-met relative
###Gene_Info_Comments GLEAN3_18779 ###
SEMA/PSI
could be a semaphorin or plexin or a c-met relative
###Gene_Info_Comments GLEAN3_06483 ###
SEMA/PSI
could be a semaphorin or plexin or a c-met relative
###Gene_Info_Comments GLEAN3_12034 ###
PSI/IPT x3

looks like a fragment of plexin - maybe c-met family
###Gene_Info_Comments GLEAN3_13447 ###
PSI/IPT x3/TM
###Gene_Info_Comments GLEAN3_22072 ###
same as GLEAN3_12389

adjacent to cleavage histone H2a  GLEAN3_22075
###Gene_Info_Comments GLEAN3_12389 ###
This is a duplicate of GLEAN3_22072 and likely should be deleted
###Gene_Info_Comments GLEAN3_25514 ###
This gene is on three scaffolds (544, 81593 and 56300). On scaffold 544 (exon 1) there is one GLEAN model (GLEAN3_25514), which covers exons 1-3 of this gene. On scaffold 81593, GLEAN3_14036 is predicted (exons 2-15) for this gene. On scaffold 56300, the gene prediction is GLEAN3_17882 (exons 8-24). Exons 2 and 3 overlap between scaffolds 544 and 81593, and exons 8-15 overlap between scaffolds 81593 and 56300. Please refer to GLEAN3_14036 and GLEAN3_17882 for gene features of the C-terminal portion.
###Gene_Info_Comments GLEAN3_22075 ###
Adjacent to Glean3_22072 cleavage stage histone H4
###Gene_Info_Comments GLEAN3_24567 ###
Similar to oocyte-specific human and mouse H1oo; there are no other known cleavage stage orthologues of the histones in mammals.
###Gene_Info_Comments GLEAN3_26501 ###
This is a fragment of the H2aZ/x gene or a pseudogene; should be deleted
###Gene_Info_Comments GLEAN3_24890 ###
histone H2aZ of mammals  
###Gene_Info_Comments GLEAN3_14432 ###
There are two macro histone H2a isoforms in the urchin genome.  The other one is GLEAN3_15435
###Gene_Info_Comments GLEAN3_15435 ###
There is a second isoform of macro H2a in the urchin genome: GLEAN3_14432
###Gene_Info_Comments GLEAN3_11356 ###
This is an internal duplicate of GLEAN3_04671; this sequence contains a stretch of possibly spurious amino acids, absent from 04671. 
See protein:
MFNHLSYMRPRDVQQREFEVLLKLNHKNIIHLDQIEEDQISKQPVIVMELCTGGSLYTYLDTPENLYGLKEKEFLQVLNDVSAGMKHLRDKGIVHRDIKPGNIMMVKGEDGVIVYKLADFGAARELEDDEAFKSLYGTEEYL

{LADFGAARELEAEAFKSLYGTEKYL} spurious?

HPDLYERAVLGKRNRTEFTAQTDLWSLGVTFFHTATGSLPFRAHGGRNNREVMHNITTTKKSSMISGVQEVPNGPIIWSEDLPKTCHFSPDMRHQVKTLLSHTMECNIQKMWTFDIFFDFVQRLTSMRPVDVFVAPTCECHIIYVEPQRKVVDFQDKMAHLTGIQPDTQSLLWDKEVFDLKKCLKCEDLPQTSPDNPLILVRMGAEVFPSVKVPSLR
###Gene_Info_Comments GLEAN3_03438 ###
this is likely to be part of the same gene as 03534
###Gene_Info_Comments GLEAN3_12899 ###
Overlaps with GLEAN3_22035: the N terminal part of this glean is identical to 22035, then the C terminus diverges and is longer than the C term of 22035.
###Gene_Info_Comments GLEAN3_22035 ###
Overlaps with GLEAN3_12899. The N term of this protein is longer. 12899 is an internal identical match over its N terminal portion, but the proteins diverge in their C termini.
###Gene_Info_Comments GLEAN3_05514 ###
overlapping part of GLEAN3_27573
###Gene_Info_Comments GLEAN3_08172 ###
contains RRM (RNA Recognition motif)
###Gene_Info_Comments GLEAN3_14468 ###
LY4-LDLA5
###Gene_Info_Comments GLEAN3_06165 ###
LDLA/EGF2/LY5/EGF/MULTIPLE LDLA/EGF2/LY2/EGF2
###Gene_Info_Comments GLEAN3_05642 ###
LY/LDLA/LY4/EGF4/CCP - likely a fragment

novel architecture - LRPs in other species do not have CCP
###Gene_Info_Comments GLEAN3_20978 ###
LDLA2/EGF2/LY3/EGF/LY6/EGF/MULTIPLE LDLA/EGF2/LY5/EGF/LY4/EGF.
LY5/EGF/LY5/EGF/MULTIPLE LDLA/EGF/LY5/EGF/MULTIPLE LDLA/EGF..
LY3/EGF6-TM
###Gene_Info_Comments GLEAN3_24306 ###
very similar to GLEAN3_08172, contains 2 RRM motifs 
###Gene_Info_Comments GLEAN3_07072 ###
multiple LDLA/EGF2/LY3/EGF-TM
###Gene_Info_Comments GLEAN3_22132 ###
LDLA/EGF2/LY5/EGF/multiple LDLA/EGF2/LY3/EGF2-TM
###Gene_Info_Comments GLEAN3_09930 ###
LDLA3/EGF2/LY4 - TM
###Gene_Info_Comments GLEAN3_25837 ###
EGF/LY/EGF/LY2/EGF2/LY/LDLA/LY4/EGF2/CCP - TM

novel architecture - LRPs in other species do not have CCP
###Gene_Info_Comments GLEAN3_27625 ###
LY/LDLA/LY4/EGF/CCP - likely a fragment

novel architecture - LRPs in other species do not have CCP
###Gene_Info_Comments GLEAN3_21270 ###
highly similar to Glean3_17749
###Gene_Info_Comments GLEAN3_17749 ###
Highly similar to Glean3_21270
###Gene_Info_Comments GLEAN3_25763 ###
DNA mismatch repair: The Escherichia coli MutHLS system has been highly conserved throughout evolution. The eukaryotic pathway results in a specialization of MutS homologs that have evolved to play crucial roles in both DNA mismatch repair and meiotic recombination. In Saccharomyces cerevisiae, MSH4 (MutS homolog 4) is a meiosis-specific protein that is not involved in mismatch correction. This protein is required for reciprocal recombination and proper segregation of homologous chromosomes at meiosis I. Paquis-Flucklinger et al. identified the human MSH4 homolog gene. The predicted amino acid sequence shows 28.7% identity with the S. cerevisiae MSH4 protein.
###Gene_Info_Comments GLEAN3_12547 ###
contains SAM (sterile alpha motif) repeats; Glean3_26761 is a partial sequence of this one (different scaffolds)
###Gene_Info_Comments GLEAN3_27921 ###
The S. cerevisiae DMC1 gene is essential for meiotic recombination. Its encoded protein is structurally and evolutionally related to the products of the yeast RAD51 and E. coli RecA genes. The ovary is one of the high-expression sites for this gene.
###Gene_Info_Comments GLEAN3_26761 ###
partial sequence of Glean3_12547 (different scaffolds)
###Gene_Info_Comments GLEAN3_17127 ###
SRCR(5). Possibly incomplete.
###Gene_Info_Comments GLEAN3_17194 ###
SRCR(2). Possibly incomplete.
###Gene_Info_Comments GLEAN3_17453 ###
SigPep-SRCR(2)-TM.  
###Gene_Info_Comments GLEAN3_17933 ###
SRCR(4). Possibly incomplete.
###Gene_Info_Comments GLEAN3_18252 ###
SRCR(8)-TM. Possibly incomplete.
###Gene_Info_Comments GLEAN3_18429 ###
SRCR(13). Probably incomplete. See GLEAN3_18430.
###Gene_Info_Comments GLEAN3_18430 ###
SigPep-SRCR(5). Probably incomplete. See GLEAN3_18429.
###Gene_Info_Comments GLEAN3_18508 ###
SigPep-SRCR(2)-TM.
###Gene_Info_Comments GLEAN3_18737 ###
SigPep-SRCR(3)-TM. 
###Gene_Info_Comments GLEAN3_18939 ###
SRCR(5)-TM. Possibly incomplete.
###Gene_Info_Comments GLEAN3_18985 ###
SRCR(3). Possibly incomplete.
###Gene_Info_Comments GLEAN3_20917 ###
allele: GLEAN3_21818
###Gene_Info_Comments GLEAN3_21818 ###
Short, 8kb contig containing 6 exons coding for N-terminal 254 aa; allele of full length GLEAN3_20917 (17 exons).
###Gene_Info_Comments GLEAN3_04177 ###
3 exons in 8 kb contig coding for C-terminal 75 aa.
###Gene_Info_Comments GLEAN3_15969 ###
GLEAN3_15969 may contain exons from two different genes. The N-terminal cds (182 aa) is homologous to cytochrome P450; the C-terminal cds is homologous to Exoc4. Alleles: GLEAN3_01083, GLEAN3_01084, GLEAN3_22381.
###Gene_Info_Comments GLEAN3_01084 ###
GLEAN3_01084 contains 15 exons coding for C-terminal 2/3 of Sp-Exoc4; GLEAN3_01083 contains 8 exons coding for the N-terminus. Alleles: GLEAN3_22381, GLEAN3_15969.
###Gene_Info_Comments GLEAN3_01083 ###
GLEAN3_01083 contains 8 exons coding for N-terminal 1/3 of Sp-Exoc4; GLEAN3_01084 contains 15 exons coding for the C-terminus. Alleles: GLEAN3_22381, GLEAN3_15969.

###Gene_Info_Comments GLEAN3_22381 ###
GLEAN3_22381 occupies the entire length of a short (13 kb) contig; contains 8 exons coding for the N-terminus of Sp-Exoc4. Alleles: GLEAN3_01083, GLEAN3_01084, GLEAN3_15969.
###Gene_Info_Comments GLEAN3_19241 ###
SigPep-SRCR(4). Possibly incomplete.  Near 7TM SRCR [GLEAN3_19239].
###Gene_Info_Comments GLEAN3_13045 ###
allele: GLEAN3_20544
###Gene_Info_Comments GLEAN3_20544 ###
GLEAN3_20544 appears to be truncated, containing 3 exons located near the edge of the contig, coding for the C-terminal 155 aa of Sp-Exoc6. GLEAN3_13045 is a full-length allele, 19 exons.
###Gene_Info_Comments GLEAN3_19262 ###
SRCR(2). Probably incomplete. See GLEAN3_19263.
###Gene_Info_Comments GLEAN3_19263 ###
SRCR(3). Probably incomplete. See GLEAN3_19262.
###Gene_Info_Comments GLEAN3_01461 ###
Similar to PTPR phi, short or long insert variant...a member of the PTPR BHJOQ superfamily?
###Gene_Info_Comments GLEAN3_19291 ###
SRCR(2). Probably incomplete.
###Gene_Info_Comments GLEAN3_19370 ###
SRCR(5)-Sushi-TM. Possibly incomplete. Like gi|8547243|gb|AAF76316.1|AF228824_1 scavenger receptor cysteine-rich protein variant 1 [Strongylocentrotus purpuratus] and >gi|8547245|gb|AAF76317.1|AF228825_1 scavenger receptor cysteine-rich protein variant 2 [Strongylocentrotus purpuratus] from coelomocytes (as published by Z. Pancer).
###Gene_Info_Comments GLEAN3_19374 ###
SRCR(2)-HYR-SRCR(2). Possibly incomplete. See GLEAN3_19370.
###Gene_Info_Comments GLEAN3_19479 ###
SRCR(7). Possibly incomplete.
###Gene_Info_Comments GLEAN3_19826 ###
SRCR(5)-TM. Possibly incomplete.
###Gene_Info_Comments GLEAN3_20081 ###
SRCR(5). Possibly incomplete.
###Gene_Info_Comments GLEAN3_20161 ###
SRCR(3). Possibly incomplete.
###Gene_Info_Comments GLEAN3_18071 ###
Part of PTPRB??
###Gene_Info_Comments GLEAN3_20273 ###
GF(2)-SRCR(2). Possibly incomplete.
###Gene_Info_Comments GLEAN3_20597 ###
SRCR(8)-Sushi(3)-HYR-Sushi_HYR(2)-Sushi(2). Possibly incomplete.
###Gene_Info_Comments GLEAN3_20650 ###
TIL-SRCR(4). Possibly incomplete. Similar to >gi|8547241|gb|AAF76315.1|AF228823_1 scavenger receptor cysteine-rich protein [Strongylocentrotus purpuratus] published by Z. Pancer.
###Gene_Info_Comments GLEAN3_20822 ###
SigPep-SRCR(4)-TM.
###Gene_Info_Comments GLEAN3_20868 ###
SigPep-SRCR(2). Possibly incomplete.
###Gene_Info_Comments GLEAN3_21124 ###
SigPep-SRCR(6)-TM.
###Gene_Info_Comments GLEAN3_21348 ###
SigPep-SRCR(3). Possibly incomplete.
###Gene_Info_Comments GLEAN3_21457 ###
SRCR(2). Probably incomplete.
###Gene_Info_Comments GLEAN3_21509 ###
SigPep-SRCR(6). Possibly incomplete.
###Gene_Info_Comments GLEAN3_21691 ###
SigPep-SRCR(2). Probably incomplete. See GLEAN3_21692.
###Gene_Info_Comments GLEAN3_04193 ###
This Glean sequence corresponds to 2 different genes: the complete sequence for Hhat and the 3' portion of PTPRscav, a novel PTPR, containing the PTP domain.
###Gene_Info_Comments GLEAN3_21692 ###
SRCR(4)-TM. Probably incomplete. See GLEAN3_21691.
###Gene_Info_Comments GLEAN3_21782 ###
SRCR(3). Possibly incomplete.
###Gene_Info_Comments GLEAN3_21789 ###
SRCR(7)-TM. Possibly incomplete.
###Gene_Info_Comments GLEAN3_08670 ###
Partial sequence.  Contains one protein tyrosine phosphatase domain.
###Gene_Info_Comments GLEAN3_21890 ###
SRCR(3). Probably incomplete.
###Gene_Info_Comments GLEAN3_21987 ###
SRCR(5)-TM. Probably incomplete. See GLEAN3_21988.
###Gene_Info_Comments GLEAN3_21988 ###
SigPep-SRCR(5). Probably incomplete. See GLEAN3_21987.
###Gene_Info_Comments GLEAN3_22000 ###
Somat-SRCR-Somat-SRCR-CUB.
###Gene_Info_Comments GLEAN3_28592 ###
Homologous to PTPN 14/21.  Also similar to the novel Drosophila protein PTPpez.
###Gene_Info_Comments GLEAN3_22085 ###
SRCR(5)-TM. Possibly incomplete.
###Gene_Info_Comments GLEAN3_22145 ###
SigPep-SRCR(6). Probably incomplete. See GLEAN3_22146, 22147, 22148, 22149, 22150, 22151.
###Gene_Info_Comments GLEAN3_22146 ###
SRCR(5). Probably incomplete. See GLEAN3_22145, 22147, 22148, 22149, 22150, 22151.
###Gene_Info_Comments GLEAN3_22147 ###
SRCR(3). Probably incomplete. See GLEAN3_22145, 22146, 22148, 22149, 22150, 22151.
###Gene_Info_Comments GLEAN3_22148 ###
SRCR(4). Probably incomplete. See GLEAN3_22145, 22146, 22147, 22149, 22150, 22151.
###Gene_Info_Comments GLEAN3_22149 ###
SRCR(8). Probably incomplete. See GLEAN3_22145, 22146, 22147, 22148, 22150, 22151.
###Gene_Info_Comments GLEAN3_22150 ###
SRCR(5). Probably incomplete. See GLEAN3_22145, 22146, 22147, 22148, 22149, 22151.
###Gene_Info_Comments GLEAN3_22151 ###
SRCR(2). Probably incomplete. See GLEAN3_22145, 22146, 22147, 22148, 22149, 22150.
###Gene_Info_Comments GLEAN3_03149 ###
Partial sequence.
###Gene_Info_Comments GLEAN3_03479 ###
Similar to CDKN3
###Gene_Info_Comments GLEAN3_02928 ###
Partial sequence. Homologous to human TPIP/TPTE.
###Gene_Info_Comments GLEAN3_03142 ###
Partial sequence. PTENb...See also GLEAN3_23882 (identical gene) and GLEAN3_09522 (PTENa).
###Gene_Info_Comments GLEAN3_22991 ###
Identical to 11961 (11961 is missing aa 73-109) except for 5' end
###Gene_Info_Comments GLEAN3_11961 ###
Identical to 22991 except for 5' end (note 11961 is missing aa 73-109 of 22991)
###Gene_Info_Comments GLEAN3_28117 ###
AA 28-89 IDENTICAL TO GLEAN3_20429, but 5' and 3' end differ
###Gene_Info_Comments GLEAN3_11467 ###
Identical to 06006 (except 06006 is missing aa 1-69)
###Gene_Info_Comments GLEAN3_06006 ###
IDENTICAL TO 11467 (except missing aa 1-69 of 11467)
###Gene_Info_Comments GLEAN3_06389 ###
same as GLEAN3_06388 (IDENTICAL except at 3' END)
###Gene_Info_Comments GLEAN3_12085 ###
IDENTICAL TO 11970
###Gene_Info_Comments GLEAN3_03228 ###
almost IDENTICAL TO 17155
###Gene_Info_Comments GLEAN3_17155 ###
almost IDENTICAL TO 03228
###Gene_Info_Comments GLEAN3_18282 ###
identical to 13863
###Gene_Info_Comments GLEAN3_24480 ###
PREDICTED: similar to castor homolog 1, zinc finger 
###Gene_Info_Comments GLEAN3_18645 ###
PREDICTED: Strongylocentrotus purpuratus similar to RNA binding motif protein 13 (LOC586606), mRNA
###Gene_Info_Comments GLEAN3_07493 ###
PREDICTED: Strongylocentrotus purpuratus similar to Beta-amyloid-like protein precursor (LOC585391), mRNA

###Gene_Info_Comments GLEAN3_17861 ###
Involved in meiotic recombination; found originally in yeast.

Domain search
gnl|CDD|16546 cd00223, Topo6_Spo, DNA topoisomerase VI subunit A. Homologous to type II topoisomerase, meiotic recombination factor, Spo11; generates double stranded breaks that initiate homologous recombination during meiosis. Subunit A forms homodimers which contain a deep groove that spans both protomers; the dimer architecture suggests that DNA is bound in the groove across the A subunit interface, and that the two monomers separate during DNA transport.

###Gene_Info_Comments GLEAN3_15850 ###
PREDICTED: Strongylocentrotus purpuratus similar to CUG triplet repeat,RNA-binding protein 2 (LOC576912), partial mRNA.
###Gene_Info_Comments GLEAN3_02220 ###
PREDICTED: Strongylocentrotus purpuratus similar to survival of motor neuron 1, telomeric isoform b (LOC575759)
###Gene_Info_Comments GLEAN3_22717 ###
non-identical duplicate of GLEAN3_27144, _02836. A portion of the overlap is identical, but these do appear to be distinct genes.
###Gene_Info_Comments GLEAN3_06197 ###
This is the N terminal part of the protein. The middle portions are in GLEAN3_27144 and _02836
###Gene_Info_Comments GLEAN3_06784 ###
Non-identical to other PI3K p110 GLEANs: 22717, 06197, 27144, 02836. This gene may be full length.
###Gene_Info_Comments GLEAN3_17593 ###
PREDICTED: Strongylocentrotus purpuratus similar to baculoviral IAP repeat-containing 2 (LOC583399)
###Gene_Info_Comments GLEAN3_04809 ###
partial sequence
###Gene_Info_Comments GLEAN3_28521 ###
PREDICTED: Strongylocentrotus purpuratus similar to Elongation factor G 2, mitochondrial precursor (mEF-G 2) (Elongation factor G2) (LOC584381)
###Gene_Info_Comments GLEAN3_10910 ###
partial sequence; non-identical to other PI3K p110 proteins. This sequence seems to be internal to 27144 and 02836. This sequence may be the C terminus that corresponds to the N terminus in GLEAN3_05073
###Gene_Info_Comments GLEAN3_13194 ###
contains SINA domains

gnl|CDD|26029 pfam03145, Sina, Seven in absentia protein family. The seven in absentia (sina) gene was first identified in Drosophila. The Drosophila Sina protein is essential for the determination of the R7 pathway in photoreceptor cell development: the loss of functional Sina results in the transformation of the R7 precursor cell to a non-neuronal cell type. The Sina protein contains an N-terminal RING finger domain pfam00097. Through this domain, Sina binds E2 ubiquitin-conjugating enzymes (UbcD1). Sina also interacts with Tramtrack (TTK88) via PHYL. Tramtrack is a transcriptional repressor that blocks photoreceptor determination, while PHYL down-regulates the activity of TTK88. In turn, the activity of PHYL requires the activation of the Sevenless receptor tyrosine kinase, a process essential for R7 determination. It is thought that Sina thus targets TTK88 for degradation, therefore promoting the R7 pathway. Murine and human homologues of Sina have also been identified. The human homologue Siah-1 also binds E2 enzymes (UbcH5) and, through a series of physical interactions, targets beta-catenin for ubiquitin degradation. Siah-1 expression is enhanced by p53, itself promoted by DNA damage. Thus this pathway links DNA damage to beta-catenin degradation. Sina proteins, therefore, physically interact with a variety of proteins. The N-terminal RING finger domain that binds ubiquitin conjugating enzymes is described in pfam00097, and does not form part of the alignment for this family. The remainder C-terminal part is involved in interactions with other proteins, and is included in this alignment. In addition to the Drosophila protein and mammalian homologues, whose similarity was noted previously, this family also includes putative homologues from Caenorhabditis elegans and Arabidopsis thaliana.
###Gene_Info_Comments GLEAN3_05073 ###
This appears to be the N terminus; the C terminus may be on GLEAN3_10910
###Gene_Info_Comments GLEAN3_01697 ###
This gene prediction is only N-terminus of GAT and should be combined with GLEAN3_01698.  
###Gene_Info_Comments GLEAN3_10777 ###
Fas Associated death domain.  
###Gene_Info_Comments GLEAN3_00102 ###
GLEAN3_04413 is a perfect duplicate of this glean (protein level)
###Gene_Info_Comments GLEAN3_04413 ###
This is a perfect duplicate of GLEAN3_00102 (protein level)
###Gene_Info_Comments GLEAN3_21153 ###
partial protein; GLEAN3_19900 is a nearly perfect internal duplicate
###Gene_Info_Comments GLEAN3_07693 ###
Groups with caspase 9 subfamily by neighbor joining of multiple sequence alignment.  Appears to lack an N-terminus.
###Gene_Info_Comments GLEAN3_08540 ###
Groups with caspase 8 subfamily in neighbor-joining of multiple sequence alignment.
###Gene_Info_Comments GLEAN3_09497 ###
Very high similarity to GLEAN3_11339, GLEAN3_22941, GLEAN3_26645, GLEAN3_21561, and GLEAN3_26743.
###Gene_Info_Comments GLEAN3_11339 ###
Very high similarity to GLEAN3_09497, GLEAN3_22941, GLEAN3_26645, GLEAN3_21561, and GLEAN3_26743.
###Gene_Info_Comments GLEAN3_11471 ###
Nearly identical to C-terminus of GLEAN3_17523, GLEAN3_09653, and also with significant similarity to GLEAN3_19822, GLEAN3_09497, GLEAN3_11339.  The tiling expression data indicate that some exons are missing from this model.
###Gene_Info_Comments GLEAN3_11061 ###
This is the ligand binding domain of gcnf1, and belongs with glean3_00749 (see that model for data).
###Gene_Info_Comments GLEAN3_16039 ###
Groups with the caspase 8 subfamily by neighbor joining of multiple sequence alignment.  Sister of caspase8-like/caspase8-like-2a/b.
###Gene_Info_Comments GLEAN3_18137 ###
Highly similar to GLEAN3_18315, may be a duplication.  See annotation to that gene.  It is not clear where the N-terminus of this model is.
###Gene_Info_Comments GLEAN3_18315 ###
Most similar to GLEAN3_18137; the latter may be a duplication of the C-terminus of this gene, possibly an artifact.  This model contains two caspase domains that are not identical, and may represent a tandem duplication.  N-terminus is ambiguous (no Methionine).
###Gene_Info_Comments GLEAN3_18506 ###
Groups with the caspase 9 subfamily by neighbor joining of multiple sequence alignment.
###Gene_Info_Comments GLEAN3_20623 ###
Groups with caspase 8 subfamily by neighbor joining of multiple sequence alignment.
###Gene_Info_Comments GLEAN3_22177 ###
Similar to GLEAN3_08540, GLEAN3_19839, GLEAN3_16039.
###Gene_Info_Comments GLEAN3_14473 ###
partial sequence.
GLEAN3_26766 is a non-identical duplicate

###Gene_Info_Comments GLEAN3_19900 ###
GLEAN3_21153 is a nearly identical duplicate that is longer on both ends.
###Gene_Info_Comments GLEAN3_23504 ###
GLEAN3_08853 is a non-identical duplicate

###Gene_Info_Comments GLEAN3_10284 ###
partial sequence; this GLEAN is a nearly identical internal duplicate of GLEAN3_00151
###Gene_Info_Comments GLEAN3_08853 ###
GLEAN3_23504 is a non-identical duplicate
###Gene_Info_Comments GLEAN3_25578 ###
glean3_16326 is a partial sequence of this one
###Gene_Info_Comments GLEAN3_16326 ###
glean3_25578 is a partial sequence of this one
###Gene_Info_Comments GLEAN3_15545 ###
unclear exactly which RTK this is.
###Gene_Info_Comments GLEAN3_26766 ###
GLEAN3_14473 is a non-identical duplicate
###Gene_Info_Comments GLEAN3_19045 ###
this is a partial sequence of Glean3_04954
###Gene_Info_Comments GLEAN3_26272 ###
not clear which RTK this is
###Gene_Info_Comments GLEAN3_18556 ###
PREDICTED: Strongylocentrotus purpuratus similar to peroxisome biogenesis factor 1 (LOC592476)
###Gene_Info_Comments GLEAN3_00400 ###
PREDICTED: Strongylocentrotus purpuratus similar to folliculin isoform 1 (LOC587400)
###Gene_Info_Comments GLEAN3_07667 ###
This GLEAN may code for either the RBM16 or SFRS15 protein. 
###Gene_Info_Comments Sp-IL17r-like ###
This gene was annotated based on a manual inspection of multiple protein alignments, reciprocal BLASTing and domain structure analyses.

Given the position of this gene at the end of a scaffold, and based on alignments to other IL17 receptor genes, it is highly likely that N-terminal sequence is missing from this model.
###Gene_Info_Comments GLEAN3_01101 ###
GLEAN3_01101 matches the first half of RBM19. GLEAN3_06990 matches the latter half.
###Gene_Info_Comments GLEAN3_06990 ###
GLEAN3_01101 matches the first half of RBM19. GLEAN3_06990 matches the latter half.
###Gene_Info_Comments GLEAN3_03581 ###
There appear to be 2 gene sequences in this Glean site.  The first one corresponds to a partial sequence for a receptor protein tyrosine phosphatase (type Mu?) and the second one is similar to Alpha-2,6-sialyltransferase.
###Gene_Info_Comments GLEAN3_01558 ###
DNA polymerase lambda mediates a back-up base excision repair activity by similarity.

###Gene_Info_Comments GLEAN3_03548 ###
The zf-C4 domain is located in glean3_03547.
###Gene_Info_Comments GLEAN3_08365 ###
The mouse Ogg1 gene is involved in the repair of 8-hydroxyguanine in DNA damage.

###Gene_Info_Comments GLEAN3_18631 ###
Using HsSERPINE2 in BLASTP search of the GLEAN3 models yielded 15 genes. Each was analyzed by SMART and the top 12 hits were found to have a SERPIN domain and were annotated Sp-serpin-like 3 through 12 (2 were already annotated by A.Cameron). BLASTP searches of the GLEAN3 models with other vertebrate Serpins yielded the same top 12 hits.
###Gene_Info_Comments GLEAN3_18630 ###
Using HsSERPINE2 in BLASTP search of the GLEAN3 models yielded 15 genes. Each was analyzed by SMART and the top 12 hits were found to have a SERPIN domain and were annotated Sp-serpin-like 3 through 12 (2 were already annotated by A.Cameron). BLASTP searches of the GLEAN3 models with other vertebrate Serpins yielded the same top 12 hits.
###Gene_Info_Comments GLEAN3_18196 ###
Using HsSERPINE2 in BLASTP search of the GLEAN3 models yielded 15 genes. Each was analyzed by SMART and the top 12 hits were found to have a SERPIN domain and were annotated Sp-serpin-like 3 through 12 (2 were already annotated by A.Cameron). BLASTP searches of the GLEAN3 models with other vertebrate Serpins yielded the same top 12 hits.
###Gene_Info_Comments GLEAN3_20278 ###
This gene model could be incomplete. It is located at the end of a scaffold.
Using HsSERPINE2 in BLASTP search of the GLEAN3 models yielded 15 genes. Each was analyzed by SMART and the top 12 hits were found to have a SERPIN domain and were annotated Sp-serpin-like 3 through 12 (2 were already annotated by A.Cameron). BLASTP searches of the GLEAN3 models with other vertebrate Serpins yielded the same top 12 hits.
###Gene_Info_Comments GLEAN3_06446 ###
GLEAN3_23112 encodes first part of the gene and GLEAN3_06446 the latter half.
###Gene_Info_Comments GLEAN3_02644 ###
Alternative splicing; DNA damage; DNA repair; Lyase; Nuclear protein by similarity.
###Gene_Info_Comments GLEAN3_22678 ###
There may be a misassembly - the zinc finger part of this protein is missing.
###Gene_Info_Comments GLEAN3_09715 ###
DNA damage; DNA repair; DNA replication; DNA synthesis; DNA-binding; DNA-directed DNA polymerase; Magnesium; Metal-binding; Mutator protein; Nuclear protein by similarity.
###Gene_Info_Comments GLEAN3_08719 ###
Alternative splicing; DNA damage; DNA repair; DNA replication; DNA synthesis; DNA-binding; DNA-directed DNA polymerase; Magnesium; Metal-binding; Mutator protein; Nuclear protein by similarity.
###Gene_Info_Comments GLEAN3_12887 ###
Anti-oncogene; ATP-binding; Cell cycle; Disease mutation; DNA damage; DNA repair; DNA-binding; Nuclear protein; Nucleotide-binding; by similarity.
###Gene_Info_Comments Sp-CD59/Sca2-like1 ###
This model was created based on alignments and reciprocal BLASTing of a FgeneshAB/++ prediction with CD59 and Sca2 (two closely related genes in sequence and domain structure). Given the simple structure/sequence of these genes, and since this annotation was based purely on bioinformatic evidence, we have decided to name this gene CD59/Sca2-like.
###Gene_Info_Comments Sp-CD59/Sca2-like2 ###
This model was created based on alignments and reciprocal BLASTing of a FgeneshAB/++ prediction with CD59 and Sca2 (two closely related genes in sequence and domain structure). Given the simple structure/sequence of these genes, and since this annotation was based purely on bioinformatic evidence, we have decided to name this gene CD59/Sca2-like.
###Gene_Info_Comments GLEAN3_19819 ###
A segment of this model is duplicated on GLEAN3_25718.

NTH1 is a DNA glycosylase that excises 5-formyluracil, 5-hydroxymethyluracil and thymine glycol in human cells.
###Gene_Info_Comments GLEAN3_06294 ###
Mitochondrial DNA polymerase, DNA polymerase gamma by similarity.

###Gene_Info_Comments GLEAN3_27975 ###
Role in DNA mismatch repair by similarity.
###Gene_Info_Comments GLEAN3_17492 ###
Transcriptome data indicates that Glean may have falsely predicted the following exons: 1.
###Gene_Info_Comments GLEAN3_25239 ###
also see glean model 18861.
###Gene_Info_Comments GLEAN3_03547 ###
Ligand binding domain is in glean3_03548
###Gene_Info_Comments GLEAN3_04540 ###
May have a couple of extra exons predicted.
###Gene_Info_Comments GLEAN3_22735 ###
GLEAN3_02789 has the first part of the PPIE gene and GLEAN3_22735 has the latter half. There appears to be a small overlap between the two predictions.
###Gene_Info_Comments GLEAN3_02789 ###
GLEAN3_02789 has the first part of the PPIE gene and GLEAN3_22735 has the latter half. There appears to be a small overlap between the two predictions.
###Gene_Info_Comments GLEAN3_00875 ###
e-val for NP_878906 = e-136.
This peptide is identical in length (476aas) and sequence to Glean3_10081 on scaffold 6159.
Annotated by RA Obar, RL Morris and AM Musante.
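
The e-values in these comments are written both in full ("6e-42") and in BLAST shorthand ("e-136", which stands for 1e-136). A minimal sketch for pulling them out of a comment line as numbers follows; the function name and the regular expression are illustrative assumptions, not part of the annotation pipeline, and values written as "0" or "0.0" are deliberately left alone.

```python
import re

# Assumed pattern: optional mantissa, then "e-" and an exponent.
# The lookbehind keeps accession numbers and words such as "like-2a" from matching.
EVALUE = re.compile(r"(?<![\w.])(\d+(?:\.\d+)?)?e-(\d+)\b")

def find_evalues(comment):
    """Return every e-value found in a comment line as a float.

    BLAST shorthand such as "e-136" is read as 1e-136.
    """
    values = []
    for mantissa, exponent in EVALUE.findall(comment):
        # An empty mantissa means the shorthand form was used.
        values.append(float(f"{mantissa or '1'}e-{exponent}"))
    return values
```

For example, find_evalues("e-val for NP_878906 = e-136.") yields a one-element list holding 1e-136.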
###Gene_Info_Comments GLEAN3_02954 ###
e-val for NP_060111= 8e-123, [Homo sapiens]. 
Kinesin-2 family member.
Annotated by RA Obar, RL Morris, SC Cummings, EA Kovacs, and EJ Jin.
###Gene_Info_Comments GLEAN3_03717 ###
E value for NP_659464 = 5e-72 KIF6 [Homo sapiens]
Annotation by RA Obar, RL Morris, AL Silverio, BJ Chick, and B Rossetti.


###Gene_Info_Comments GLEAN3_04939 ###
e val for NP_085118 = 1e-69.
Probably incomplete based on its size (390 AA)
Same length but different sequence as GLEAN3_12452.
Annotation by RA Obar, RL Morris, BA Jeffrey,  J Bhatia, AM Musante.
###Gene_Info_Comments GLEAN3_05509 ###
e val for NP_008985= 6e-42; kinesin family member 3A [Homo sapiens].
Annotation by RA Obar, RL Morris, LE Shorey, SA Tower, and EJ Jin.
###Gene_Info_Comments GLEAN3_05741 ###
e val = e-148 for NP_006836, KIF2C [Homo sapiens].
Annotated by RA Obar, RL Morris, R Yen,  JM Fess, and AP Rawson.
###Gene_Info_Comments GLEAN3_06602 ###
The e value was 0 for CAC20443/Q9H193. "KINESIN-13A2, ACCESSION Q9H193, swissprot locus Q9H193_HUMAN"
Motor domain is KIF1-like.
Kinesin-3 family.
Annotated by RA Obar, BD Dyer, RL Morris, and B Rossetti.


-BDD

 
 
###Gene_Info_Comments GLEAN3_07505 ###
 e = 0 for NP_659464.
Annotated by RA Obar, BD Dyer, RL Morris, and B Rossetti.
 
###Gene_Info_Comments GLEAN3_10570 ###
e-val for NP_005724= 5e-93, [Homo sapiens].
Annotated by RA Obar, RL Morris, SC Cummings, EA Kovacs, and B Rossetti.
###Gene_Info_Comments GLEAN3_11353 ###
e val = 3e-61 for NP_071396, KIF13A [Homo sapiens].
Annotated by RA Obar, RL Morris, R Yen, JM Fess, and B Rossetti.
###Gene_Info_Comments GLEAN3_12452 ###
e val = 1e-49 for NP_085118.
Same length but different sequence as GLEAN3_04939.
Annotated by RA Obar, BD Dyer, RL Morris, AM Musante.
###Gene_Info_Comments GLEAN3_15247 ###
May play a role in nucleotide excision repair (NER) and RNA polymerase II (POL II) transcription by interacting with ERCC2/XPD and ERCC3/XPB helicases, both subunits of NER-transcription factor TFIIH (by similarity).
###Gene_Info_Comments GLEAN3_14451 ###
e val for NP_002254 = 3e-34
Exact match of XP_796037: PREDICTED: similar to kinesin family member C1 [Strongylocentrotus purpuratus]. Explained by Scaffold3656. Contains 8 exons.
Annotation by RA Obar, RL Morris, BA Jeffrey, J Bhatia, B Rossetti, EJ Jin, KM Judkins
###Gene_Info_Comments GLEAN3_15354 ###
e val = 6e-28 for NP_004789, KIF3B [Homo sapiens].
Contains partial kinesin motor domain, N terminus missing,  when compared to human CENP-E motor domain.  Likely to be a fragment.
Annotated by RA Obar, RL Morris, R Yen, JM Fess, and EJ Jin.
###Gene_Info_Comments GLEAN3_16067 ###
e val for CAI12999 = 3e-84.
e val for NP_006836 = e-52; kinesin family member 2C [Homo sapiens].
Annotation by RA Obar, RL Morris, SA Tower, and B Rossetti.
###Gene_Info_Comments GLEAN3_16655 ###
e val for AAI10990=1e-111.
e val for NP_112494=5e-81; kinesin family member 18A [Homo sapiens].  
Annotation by RA Obar, RL Morris, LE Shorey, SA Tower, KM Judkins
###Gene_Info_Comments GLEAN3_17809 ###
accession number (Swiss-Prot): Q02224
Likely a fragment based on its short length.
e val for Q02224 is 3e-25
Annotation by RA Obar, RL Morris, BA Jeffrey, KM Judkins
###Gene_Info_Comments GLEAN3_17289 ###
e val for AAH73878 = 2e-45.
e val for NP_002254 = 4e-47.
Annotated by RA Obar, RL Morris, and KM Judkins
###Gene_Info_Comments GLEAN3_21586 ###
Hydrolysis of the deoxyribose N-glycosidic bond to excise 3-methyladenine, and 7-methylguanine from the damaged DNA polymer formed by alkylation lesions (by similarity).
###Gene_Info_Comments GLEAN3_28109 ###
GLEAN3_18262 sequence is a close but not identical match of the model
also
GLEAN3_14629 is a duplicate of the N-terminal region of the model
also
GLEAN3_18263 contains a sequence match to mid region section of the model
###Gene_Info_Comments GLEAN3_18388 ###
e val = e-109 for NP_904325, KIF1B [Homo sapiens] 
see also GLEAN3_18764, also likely ortholog of KIF1B isoform alpha.
Annotated by RA Obar, RL Morris, R Yen, and IJ Strachan
###Gene_Info_Comments GLEAN3_18533 ###
e val = 8e-147 against AAH35896. 
e val = e-150 against Q8IUN3_HUMAN Q8IUN3 (UniProtKB/TrEMBL Accession Number)
Annotation by RA Obar, RL Morris, AL Silverio, BJ Chick, and EJ Jin
###Gene_Info_Comments GLEAN3_18764 ###
e val for NP_904325 is 8e-109.
Annotation by RA Obar, RL Morris, BA Jeffrey, and IJ Strachan
###Gene_Info_Comments GLEAN3_19165 ###
e val 0 against AAG18582/NP_999659, KRP110 [Strongylocentrotus purpuratus].
e val for NP_612565=e-161; kinesin family member 23 Isoform 1 [Homo sapiens].  
Annotation by RA Obar, RL Morris, LE Shorey, SA Tower, and EJ Jin.
###Gene_Info_Comments GLEAN3_19775 ###
e value = 5e-37 for NP_612433; kinesin family member 12 [Homo sapiens].
Annotation by R. A. Obar and R.L. Morris
###Gene_Info_Comments GLEAN3_20767 ###
e val for NP_056069=e-172; kinesin family member 13B [Homo sapiens].
Annotation by RA Obar, RL Morris, SA Tower
###Gene_Info_Comments GLEAN3_24588 ###
e val = 1e-54 for NP_112494.
likely a fragment, based on its short length.
Annotated by RA Obar, RL Morris, and KM Judkins
###Gene_Info_Comments GLEAN3_23560 ###
Q4VXC4 is UniProtKB accession number.
e val for BAE02544 = 8e-106, and for Q4VXC4 = e-104.
see also: GLEAN3_18388, GLEAN3_18764, GLEAN3_20634, which are all KIF1B-like.
Annotation by: RA Obar, RL Morris, BA Jeffrey, and IJ Strachan
###Gene_Info_Comments GLEAN3_22982 ###
e val = 8e-29 for NP_004789, KIF3B [Homo sapiens].
Annotated by RA Obar, RL Morris, AM Musante, and EJ Jin
###Gene_Info_Comments GLEAN3_21317 ###
e val for NP_999656 is 0.0
Annotation by RA Obar, RL Morris, BA Jeffrey, and AP Rawson
###Gene_Info_Comments GLEAN3_22160 ###
e val for NP_112494 = 3e-120
Annotated by RA Obar, RL Morris, and KM Judkins
###Gene_Info_Comments GLEAN3_26503 ###
See GLEAN3_20414.
The KRP170 gene spans GLEAN3_20414 (Scaffold1612/Scaffoldi17703) and GLEAN3_26503 (Scaffold56862/Scaffoldi4507).  The mRNA was published as Chui,K.K., Rogers,G.C., Kashina,A.M., Wedaman,K.P., Sharp,D.J., Nguyen,D.T., Wilt,F. and Scholey,J.M.  "Roles of two homotetrameric kinesins in sea urchin embryonic cell division."   J. Biol. Chem. 275 (48), 38005-38011 (2000).  The GenBank entry for this gene is gi|10697491|gb|AF292395.2|.
Annotation by RA Obar, RL Morris, AL Silverio, BJ Chick, EJ Jin.
###Gene_Info_Comments GLEAN3_27784 ###
e val = 2e-128 for CAI95220.  e val = e-130 for NP_055889, KIF1B_beta [Homo sapiens].
Annotation by R.A. Obar, R.L. Morris, and AP Rawson 013106
###Gene_Info_Comments GLEAN3_27552 ###
   e val = e-103 for NP_060046, KIF27 [Homo sapiens].
   Annotation by R.A.Obar and R.L. Morris, BJ Rossetti and AP Rawson 013106
###Gene_Info_Comments GLEAN3_21946 ###
Responsible for preventing misincorporation of 8-oxo-dGTP into DNA thus preventing A:T to C:G transversions (by similarity).
###Gene_Info_Comments GLEAN3_12960 ###
GLEAN3_12960 is the N-terminal match to MSH6 mouse.
GLEAN3_28261 is the C-terminal match to MSH6 mouse.
GLEAN3_15322 is the full match to the Query used.
###Gene_Info_Comments GLEAN3_23112 ###
GLEAN3_23112 encodes first part of the gene and GLEAN3_06446 the latter half.
###Gene_Info_Comments GLEAN3_09456 ###
GLEAN3_09456 is an almost complete duplicate prediction of GLEAN3_10689 and contains the first half of the gene. GLEAN3_15881 contains the latter half.
###Gene_Info_Comments GLEAN3_15881 ###
GLEAN3_09456 is an almost complete duplicate prediction of GLEAN3_10689 and contains the first half of the gene. GLEAN3_15881 contains the latter half.
###Gene_Info_Comments GLEAN3_03050 ###
The N-terminal sequence of the mouse RAD52 used as the query against the GLEAN3 models is covered by combining GLEAN3_03050 with GLEAN3_06081.


###Gene_Info_Comments GLEAN3_08906 ###
regions of GLEAN3_08906 are present in GLEAN3_26677 and GLEAN3_28391 also.
###Gene_Info_Comments GLEAN3_26263 ###
A region of GLEAN3_26263 is present in GLEAN3_23612 also.
###Gene_Info_Comments GLEAN3_10556 ###
Regions of GLEAN3_10556 are present in GLEAN3_00741 (covers 299-621) and GLEAN3_04597 (covers 1-203) also.
###Gene_Info_Comments GLEAN3_02282 ###
GLEAN3_02282 covers the c-terminal region of Rev3.
GLEAN3_02281 covers the n-terminal region of Rev3.
###Gene_Info_Comments GLEAN3_02281 ###
GLEAN3_02281 covers the n-terminal region of Rev3.
GLEAN3_02282 covers the c-terminal region of Rev3.
###Gene_Info_Comments GLEAN3_26626 ###
This prediction includes only N-terminus and should be combined with GLEAN3_26627.  
###Gene_Info_Comments GLEAN3_17523 ###
Highly similar to GLEAN3_09653, GLEAN3_11471, GLEAN3_13801.  Presence of a sushi domain and patchiness of scaffold sequence suggests that this model might artifactually fuse two different genes.
###Gene_Info_Comments GLEAN3_21133 ###
Amino acid sequence highly similar but not identical to that of Glean3_01683; may be haplotype based on high similarity of intron sequences.  Model incomplete, lacking many exons both 5' and 3'.  (Or, may be a pseudogene.)
###Gene_Info_Comments GLEAN3_10755 ###
Partial sequence.
###Gene_Info_Comments GLEAN3_11070 ###
Partial sequence.  This contains the serine/threonine phosphatase, family 2C, catalytic (Pp2Cc) domain .  See also GLEAN3_11294.
###Gene_Info_Comments GLEAN3_11705 ###
Identical to Glean3_17161
###Gene_Info_Comments GLEAN3_17161 ###
See also GLEAN3_11705.
###Gene_Info_Comments GLEAN3_15869 ###
Beginning of the gene is missing (probably in the gap between contigs in the scaffold). The end of the gene is probably on another scaffold as well.
###Gene_Info_Comments GLEAN3_13821 ###
GLEAN3_02088 appears to be identical.
###Gene_Info_Comments GLEAN3_05107 ###
UTRs were added based on EST and tiling data
###Gene_Info_Comments GLEAN3_05433 ###
GLEAN3_05434 is a partial duplicate prediction for GLEAN3_05433. Looking at the Genboree data, it may represent an error in the assembly process. 
###Gene_Info_Comments GLEAN3_17434 ###
Partial sequence containing PTP catalytic domain.
###Gene_Info_Comments GLEAN3_05434 ###
GLEAN3_05434 is a partial duplicate prediction for GLEAN3_05433. Looking at the Genboree data, it may represent an error in the assembly process.
###Gene_Info_Comments GLEAN3_17234 ###
Partial sequence. Contains most of the PTP catalytic domain.
###Gene_Info_Comments GLEAN3_14642 ###
Partial sequence containing the PTP catalytic domain.
###Gene_Info_Comments GLEAN3_25671 ###
First exons more difficult to detect on expression microarray.  Alignment with known TAF1A proteins (mouse, rat, and human) shows little similarity towards the carboxyl end, suggesting possible misprediction of early exons (2 and 3).
###Gene_Info_Comments GLEAN3_00838 ###
Partial duplication in GLEAN3_14788
###Gene_Info_Comments GLEAN3_14788 ###
Partial duplication of GLEAN3_00838.
###Gene_Info_Comments GLEAN3_20923 ###
Added 5'UTRs to the gene model based on the EST data
###Gene_Info_Comments GLEAN3_04533 ###
Partial sequence similar to DUSP23.
###Gene_Info_Comments GLEAN3_06579 ###
In light of evidence from the hybridization array, the Fgenesh gene prediction model does a better job than the Glean3 model, though the 3' UTR predicted by Glean3 appears accurate and has been included here.
###Gene_Info_Comments GLEAN3_08840 ###
Shows similar GenBank hits to Glean3_09651, though the two sequences show little similarity to each other.  Not highly confident in this annotation.
###Gene_Info_Comments GLEAN3_09651 ###
Unclear what the exact homolog in non-urchins is.  Clusters tightly with Glean3_08840 (annotated as SpTFIIA alpha).  
###Gene_Info_Comments GLEAN3_05661 ###
Little expression evidence for exons one, two and six.  However, this is likely the correct gene, and the other exons show good evidence of expression.
###Gene_Info_Comments GLEAN3_02637 ###
Putative conserved RPB5 domain detected in BlastP search.
All exons are supported by tiling path array data.
However, query of Poustka's database does not retrieve significant hits.
###Gene_Info_Comments GLEAN3_14575 ###
This GLEAN is the C-terminal part of DGK-beta; the N-terminal part is in GLEAN3_14574. The sequence below has been modified to what is probably correct: the first few amino acids were deleted, and a spurious, non-homologous internal sequence is set off in braces and marked as not real.
DNA:
GATTCAAGGAAAACACTCAAAGATGTTTTAGAAGAATTCCATGGAAATGGAGTATTATCCAAATACAACCCCGAGCAGCCCATCAACTATGAAGGCTTCAAGCTCTTCATGGAGACTTACCTTGACGTTGACATGCCAGAGGATCTCTGTCGACACCTCTTCCTCTCCTTCGTCAAGAAGACACCTTCC

{GTGTCCAGGTCTACAAGTAGAGAGAAGGAGAGGGGCGCCATCTATGATGTCGGCAGTGCCGTCGCTACCGTGACCACCACCGCCGCGTGTGCCGCCATCACAGGGAGCAGCTCGCCGATTGAAATCGCTGGGCAGACGAGCGGCGGCAACAACAACACAGAAGGAAATGAGCTCACAAACGGTCATGGGCAAGAGTCAGGGGTTGTCACAAGAGGGAGCCCCACCCAGTCACGGAGTTCATCCAAGAAGTCCAATGCCTCTAACGGAACCCCTTCAAGTCATTCCATTCGCACCGATGCACTCAGGGTCGAACAAACCACTAATGATCATGGAGATATCAAGAAAGGGAAATCCATGGAGAACAATAGGCGGAAAAAGACAGCACTCTTCACCGCCCTGCGCAAG} NOT REAL!

ACTAAAAAGAATACCAAAGACACGGACTCACTAGGGGCATTGGCCGCCTCCCGGGGGTCACTCGCACATCCAGATGTCACCAACCAGGTCGTCTACATGAAAGACATCGTCTGCTACTTGTCGTTGCTAGAGGGCGGCAAACCAGAGGATAAATTAGAATTCATGTTCAGGCTTTATGACACAGATGACAATGGTATCCTTGACAGCAGTGAGCTAGATTGTATAGTAAACCAGATGATGCATGTAGCTGAGTATCTAGGCTGGGATGTCACTGAATTAAGACCAATCTTACAAGATATGATGATAGAGATTGACTTTGATTCAGATGGGACCGTATCGTTAGAAGAATGGATACGTGGCGGCATGACAACCATTCCTCTCTTAGTTCTCCTTGGCCTTGAATCAAATGTGAAAGATGATGGCAGTCATGTATGGAGGTTAAAACACTACAACAAACCAGCCTATTGTAACCTCTGCCTCAATCTCCTTGTAGGGTTTGGAAAACAAGGTCTATCATGTACATTCTGCAAATACACGGTTCATGAACGCTGTGTACAACGTGCCCCAGCCTGTTGCATTAGTACATATGTCAAGTCAAAACGGACAACGAATATCATGAACCATCACTGGGTAGAAGGTAACAGCCCGGGTAAATGTGACCGGTGTAAGAAGTCGATTAAGAGCTACAACGGTATTACTGGCCTTCACTGCAGATGGTGCAAGATAACGCTCCATAACAAATGTGCCTCCCACGTCAAACCAGAGTGTAACATGGGAGAATTCCGAGACCACATTCTACCTCCTACGGCTATCTGCCCAGCAGTTCTGGTAAGTTTAGTCACATTTTAA

protein:
DSRKTLKDVLEEFHGNGVLSKYNPEQPINYEGFKLFMETYLDVDMPEDLCRHLFLSFVKKTPS

{VSRSTSREKERGAIYDVGSAVATVTTTAACAAITGSSSPIEIAGQTSGGNNNTEGNELTNGHGQESGVVTRGSPTQSRSSSKKSNASNGTPSSHSIRTDALRVEQTTNDHGDIKKGKSMENNRRKKTALFTALRK} NOT REAL!

TKKNTKDTDSLGALAASRGSLAHPDVTNQVVYMKDIVCYLSLLEGGKPEDKLEFMFRLYDTDDNGILDSSELDCIVNQMMHVAEYLGWDVTELRPILQDMMIEIDFDSDGTVSLEEWIRGGMTTIPLLVLLGLESNVKDDGSHVWRLKHYNKPAYCNLCLNLLVGFGKQGLSCTFCKYTVHERCVQRAPACCISTYVKSKRTTNIMNHHWVEGNSPGKCDRCKKSIKSYNGITGLHCRWCKITLHNKCASHVKPECNMGEFRDHILPPTAICPAVLVSLVTF
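
Records like this one set a doubtful segment off in braces and follow it with a free-text label ("NOT REAL!", "not real", "non-homologous", "incorrect?"). A minimal sketch of how the curated sequence might be recovered and checked against the protein is given below. It assumes each flagged segment sits on its own line as "{SEQUENCE} label" and that the remaining DNA is read in frame 0 with the standard genetic code; the function names are hypothetical and not part of any annotation tool.

```python
import re

# Standard genetic code (NCBI translation table 1) in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumed convention: "{SEQUENCE}" followed by a label that runs to the end of the line.
FLAGGED = re.compile(r"\{[A-Za-z]*\}[^\n]*")

def strip_flagged(text):
    """Drop brace-flagged segments with their labels and join the remaining sequence."""
    kept = FLAGGED.sub("", text)
    return "".join(kept.split()).upper()

def translate(dna):
    """Translate in frame 0; unknown codons become 'X', a trailing partial codon is dropped."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))
```

Running strip_flagged over the DNA block of a record and passing the result to translate gives a protein string that can be compared with the unflagged protein lines. The comparison is only meaningful when the flagged segment is a whole number of codons, so that removing it does not shift the reading frame.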
###Gene_Info_Comments GLEAN3_23664 ###
internal exon appears to be incorrect; indicated in sequences below:
ATGCCGAAGACGTACAAGCTGGAGTACTTTAACCTTCGTGGTCGCGCGGAATTGTCCCGTCTGCTTATGGCACAAGCTGATATGAAGTATGAAGATGTTCGGCTCTCATTCGCCGATTGGGGGACAGCAAAGGGAAATCAAGATAAGTACCCCTTGGGATTTCTGCCAGTGCTGGAGGAAGATGGAAAAGTCATTTCACAGAGTATGACCATCGCTCGTCATCTGGCGAGAGAGTTTGGAATGGCGGGACAGAATGAAGAAGAAATGGTAATGATTGATATGATTTGTGAGACGTGTAACGAATTGCTGAGCAAGATGATTGAGATAGCCCTGATGCAAGGCGAAGCC

{AAGCCAAACGCGGTGAAAGAGTTTACGGAAGTCAAATCTCTACTCCCTATGAAAAATATTACAACATGGCTGGAGATGAACGGCAAGGGAAATGGATACTTCGTGGGCGAA} not real

AAGATGTCGGTGGCAGACCTTTTCGTCTTCAGCATCATGGAACACCTCTCTGGAAAATACCCAAATATCCTCACCAAGCAACCCCTTCTCCAAGCCTTCTATGAGAGAATGATGAAGGAACCCAAGTTAGCTGCCTGGATCGTGAAGCGTCCAAATCAAGATATAGATATCTAA

Protein:
MPKTYKLEYFNLRGRAELSRLLMAQADMKYEDVRLSFADWGTAKGNQDKYPLGFLPVLEEDGKVISQSMTIARHLAREFGMAGQNEEEMVMIDMICETCNELLSKMIEIALMQGEA

{KPNAVKEFTEVKSLLPMKNITTWLEMNGKGNGYFVGE} not real 

KMSVADLFVFSIMEHLSGKYPNILTKQPLLQAFYERMMKEPKLAAWIVKRPNQDIDI
###Gene_Info_Comments GLEAN3_05626 ###
Only the 5' part of this GLEAN is homologous to this kinase; this is noted in the sequences below. The C terminal protein sequence does not independently produce a significant BLAST hit. 
ATGAACATTGATGAAAAGCTTACAGCAAAGCAACGTGAGGAAAGCAAAACTAAGATTCGGCATAAAGCAGATGGCCGTGTGATGGTGGCAAAGATCGGCAAGCAGAAGATATGGAGAGCCAACCAGCACAAAATGATCCAAGAACTTGAGCTTCTCAATAAACTACAACACCCTAACGTTGTCAGATATATGGGAGCTTGTGTAAAAGATGGCCATATCCATCCTGTACTCGAGTATGTATCTGGTGGATGCTTGACGGACATCTTGGCTGATGAGAGCTTGGCGTTGTCATGGAGACAGAAGGGTGACTTAGCAACAGACATCGCTCGTGGAATGACCTACCTCCACTCACAGAATGTGTGTCATCGAGATCTCACGTCAGCGAATTGCCTCGTTCGTCAAAAGCCGAATAATGTCCTCGAGGCCATACTCACCGACTTCGGCCTCGCTCGTGTGCTCGGCTGCATGCCCGACCCTCCTCCAAACTCTCCCAGAACGCCCGAGTCTCCAGAACCGGACATAATCGACGCACCGAACGGTGGTCCGATGTTGCCTCGGATACCGTCGGCCTGCATGGACGTGCCTCGGAAGATGTCGGTTGTCGGCACCGCGTTCTGGATGGCTCCCGAAGTTTTACGAGGAGAGGAATACACTCGCCAAGTGGATGTTTTCTCGTTTGGTATCGTGGTATGCGAGATTGTAGCAAGAATAACGGCCAATCCAGACGACCTCCCGAGGACTGGGAAGTTCGGTCTCGACCTGCAGCTTTTCAAAGAGAAATGTCCAGGGATACCTGAACCCTTCCTACAGATCGCTGAAGACTGTTGTTCCATGGATCCCAGGGATCGGCCGGTTTTCGCCGAGCTCGTCCGCCGTTTCGAGATCATCCGCGGTACGTTGGACACAGAAACGAGCGACACAACGTGTTACGATGTCAACCTTACGGACATCATCAGGACAAACGATTCGGACGATGATGACGATGATTGTAGCTTTGGGTTTCAGTTTCAAACGGATCTGGATGACAAACGCAGAAGTGGACGAACAAGATTGCGAGAAGTCCTTTGTGGATGTTCTAAAGGA

{TGGATGACAAACGCAGAAGTGGACGAACAAGATTGCGAGAAGTCCTTTGTGGATGTTCTAAAGGAGTACGCTTATGCTGCAAGACAGTATTTCAGGGTTGTTACGGGTTTCATCGTCTGCTTGTTGCCGATGCTTTCACTGTGGATGTATTGTGATTTGAACAATGTCCTCCTGTGGACCTCATTAATATATGCAAGCATAGGGTTGGTCTTAGAGAACTCGACAAGATGTGTAGAAGTGCTTCATTCTTCTCAATCAATCTTCAGGACTTTGTTCAATCTAATATCCAGATTATTCTCCAGACTGTGGAATGTTGTATCGCTTTTGTTGCAAAGAGTGCATTGTACGAGATCCGCAGGCTCGGACCTGAACGAAAACGTGACGTACCCAAATCAAAATGGCGGTCCGACCAAATCGTTGCAACATGGTGATACCCCCAGCGAGGTCCTGAGGAATACATCAACCCGCGTTGCGGACTTGGTCCCAATCTTGAAGAATCGTCAGAAAGGTCCAGGACCCGGCGCCGAGGATCCAGAGTCGCAGGGTGCAAGAAAGAAGACGCTGCTTGCAAACGAATTGGAAGAAACAGAACTGGATAAGTTACATAATCCAAACATTAACAGGGTATTGCTGAATGGCCGCATGTCGCAAAAGCGAGTCAGCTTTTCTTTACAAAACAGTATTGACCGAGATGAAGGTGACCCAAATCCATGA} non-homologous

Protein:
MNIDEKLTAKQREESKTKIRHKADGRVMVAKIGKQKIWRANQHKMIQELELLNKLQHPNVVRYMGACVKDGHIHPVLEYVSGGCLTDILADESLALSWRQKGDLATDIARGMTYLHSQNVCHRDLTSANCLVRQKPNNVLEAILTDFGLARVLGCMPDPPPNSPRTPESPEPDIIDAPNGGPMLPRIPSACMDVPRKMSVVGTAFWMAPEVLRGEEYTRQVDVFSFGIVVCEIVARITANPDDLPRTGKFGLDLQLFKEKCPGIPEPFLQIAEDCCSMDPRDRPVFAELVRRFEIIRGTLDTETSDTTCYDVNLTDIIRTNDSDDDDDDCSFGFQFQTDLDDKRRSGRTRLREVLCGCSKG

{WMTNAEVDEQDCEKSFVDVLKEYAYAARQYFRVVTGFIVCLLPMLSLWMYCDLNNVLLWTSLIYASIGLVLENSTRCVEVLHSSQSIFRTLFNLISRLFSRLWNVVSLLLQRVHCTRSAGSDLNENVTYPNQNGGPTKSLQHGDTPSEVLRNTSTRVADLVPILKNRQKGPGPGAEDPESQGARKKTLLANELEETELDKLHNPNINRVLLNGRMSQKRVSFSLQNSIDRDEGDPNP} non-homologous

###Gene_Info_Comments GLEAN3_21174 ###
One of two duplicates; the other is GLEAN3_04006. This GLEAN is much shorter and is identical to GLEAN3_04006 at the protein level except at the ends, as indicated in the sequences below.

{ATGGCGGAAGACCTGCTGCATCCAGGCGCCATCGTCAAAGATCGATGGAAAGTTACCAAGAAAATTGGTGGTGGAGGCTTCGGTGAGATCTACGAAGCCCTTGACCAAGTCATTGATGAGTGCGTAGCCATCAAGCTAGAATCTGCTCTCCAACCTAAGCAGGTGCTCAAGATGGAAGTTGCCGTCCTCAAGAAACTTCAGGGG} incorrect?

CGGGATCATATCTGCAAGTTCATAGGCTGCGGTCGCAACGATCAGTTCAACTACGTTGTGATGACCGTCCAGGGCCAGAACCTTGCAGAGCTCCGCCGTGCACAGCCCCGTGGCACGTTCTCCGTCAGCACCATGCTCAGGCTTGGAGTACAGATTCTTGAATCGATAGAAAGCATCCACGAAGTGGGCTTTCTACACAGAGACATCAAACCTAGCAACTTTGCCATTGGGAAAGCTGCTGCTAACACAAGAAAGGTGTACATGTTAGACTTTGGTCTGGCAAGGCAGTATACCAATTCTCAGGGTCAAGTTAGAACGCCAAGGCCAGTTGCTGGGTTTCGTGGAACTGTTCGCTATGCTTCTGTCAATGCTCATAGAAATAGAGAGATGGGTCGTCATGATGATTTGTGGTCGTTGTTCTACATGTTAGTAGAGTTTGTCATTGGTCAACTTCCATGGAGAAAAATCAAAGACAAGGAACAAGTAGGTTTGTTGAAGGAGAAGTACGATCATCGTTTACTACTGAAACACATGCCCATGGAGTTCAAGCAGATATTAGAACAATTCCAGTCCTTAGAATATGCAGACAAACCAGATTACAAGTGCATCCATTCTTTATTAGAGCGATGTATGAACAGGAAGAATATCAAGGAGAATGATGCCTATGATTGGGAAAGACCACCAGTAGATGGAACTCATAACTTACTTCCTTCTTCAACTAGTCCTGCTCGA

{AGGAAGACTATTTTTTCTCCTAAACATCTCTTTTGGACGATTTGTAGGATTTAA} incorrect?

{MAEDLLHPGAIVKDRWKVTKKIGGGGFGEIYEALDQVIDECVAIKLESALQPKQVLKMEVAVLKKLQG} incorrect?

RDHICKFIGCGRNDQFNYVVMTVQGQNLAELRRAQPRGTFSVSTMLRLGVQILESIESIHEVGFLHRDIKPSNFAIGKAAANTRKVYMLDFGLARQYTNSQGQVRTPRPVAGFRGTVRYASVNAHRNREMGRHDDLWSLFYMLVEFVIGQLPWRKIKDKEQVGLLKEKYDHRLLLKHMPMEFKQILEQFQSLEYADKPDYKCIHSLLERCMNRKNIKENDAYDWERPPVDGTHNLLPSSTSPAR

{RKTIFSPKHLFWTICRI} incorrect?
###Gene_Info_Comments GLEAN3_10118 ###
This sequence does not include the first exon, which is instead present on GLEAN3_11286. The correct sequence is below:
DNA:
GCAGAAGATACCCGGCAGAAAGGTTTGAGGGTAGCCATTAAGAAGCTGTCCAGACCGTTCCAAACAGTCATACACGCCAAGAGGACCTACAGAGAACTACGCCTTCTCAAACATATGAGACATGAAAATGTAATCAGCCTGCTTGACTGTTTCACCCCTGACCGAGTCAACTTCTCAGATGTTTACATGGTGACCCATCTCATGGGAGCCGACCTTAACAACATCATCAAGTGTCAGAAACTCTCTGATGACCATGTCCAGTTCCTCATCTATCAGGTTCTCAGGGGCCTCAAGTACATCCATTCTGCTGGTGTGATTCATCGAGATCTCAAGCCCAGTAACATAGCTGTCAATGAAGACTGTGAACTCAGGATCCTGGACTTTGGATTAGCACGTAGCACAGACGATGAGATGACAGGATATGTAGCTACCAGATGGTATAGGGCACCTGAAATCATGCTCAATTGGATGCACTACACTGAGAAAGTTGACATCTGGTCCGTAGGCTGTATCATGGCAGAGCTCCTCACACAGAAAACCCTCTTCCCAGGGTGTGATCACATAGACCAACTGAATAAGATCATTGCTATCACAGGGAAACCAGACGAGACCTTCTTACAGAAGATCGCAAGTGAGAGTGCAAAGACATACCTGATGAGCATGGCTGCCTACCCTAAGAGGGACTTCAGCACTATCTTCCTAGGGGCCAGTCGCAAGGCTGTCGATCTTCTGGAGAAGATGCTACAATTGGATGAGGACAGAAGGCTGAGTGCTGAGCAGGCTCTTCAGCACCCCTATCTGTCTAAGTACCATGATCCAGATGATGAACCAATTGCAGCCATGTTTGATGATAGTCAGGAGAACAGCGACATCGTAATAGATGAATGGAGACAACGCGTTTTGAAAGAAGTAACAGAATTTGTTGCAGACCCAGCTCCGATGGATTGA
Protein:
AEDTRQKGLRVAIKKLSRPFQTVIHAKRTYRELRLLKHMRHENVISLLDCFTPDRVNFSDVYMVTHLMGADLNNIIKCQKLSDDHVQFLIYQVLRGLKYIHSAGVIHRDLKPSNIAVNEDCELRILDFGLARSTDDEMTGYVATRWYRAPEIMLNWMHYTEKVDIWSVGCIMAELLTQKTLFPGCDHIDQLNKIIAITGKPDETFLQKIASESAKTYLMSMAAYPKRDFSTIFLGASRKAVDLLEKMLQLDEDRRLSAEQALQHPYLSKYHDPDDEPIAAMFDDSQENSDIVIDEWRQRVLKEVTEFVADPAPMD
###Gene_Info_Comments GLEAN3_24598 ###
Internal exon may be incorrect. See sequences below.

DNA:
ATGTTTACTACTCACCAGAAATCAGGCCACGCAGGAGGCGGGGGCGTGAAGCGAATCGAGATCTTCTTGACGATGGTGGAACCGACGATACCAGACAGGAGATTCTTGAAGGTTGTCGTGACCAACAACGCTAAGGTCCAGGATCTGATCGGTCTCATCTGCTGGCACTATGTCAACAAGGGTCTTCAACCAGAACTCAATAAATCTTCAGATGGTGAATCTAGGCTTGCGGTGGACTCGACACAAATCACCCTCCGACAGATCTTAACCAAAGCTCTCAGAAAAAGAAAGGGTATCATTCCAACTGCAGGACCTCATTATCTACTGGAGAAGAAGTCTGCACCAGGGGTTCCACTAGACCTGGACCTGAAACTCTGCGAAACAGAATCCATGGACTTCATTATGGTCAGAGAACATAG

{TAGAAGAGACTACCTGAAGAGCGACAGGACGCGACCCTACTCCGGCGATAGAGAGCCCCCTCTAGTGTTGAGTACTCAGTACCGATCGTTCCGAGTCAGCATGCTGCACAAGCTCAGACCTGCCACTGAGATCCAGCTAGGT} incorrect?

ATCTCAGGGGATAAGATTGAAATCGACCCGGTAGCCCAACCAAGGAACACGCCGGCCAAGTTCTGGGGCAAGCAGAAAGCCGTCTCCATCGAGTCCGACAGGCTTGCCTTCTGTAACATCACAGATGATAAACCATCAGGGAAGTCCACATTCCGACTGACATTCAAGAGTCCCAACCATGAGTTCAAGCACTACGACTTTGAGACCGGAACCAACCTCACCAAACAGATCGTCAACCGTATCAACCACATTCTCGAGCTTCGAGCAAGCTCCGTACGAAACGACTACACGCTGTGGAGAGAGAGGAGACATAGTAGAAAGGCCCACGATAAATAG

Protein:
MFTTHQKSGHAGGGGVKRIEIFLTMVEPTIPDRRFLKVVVTNNAKVQDLIGLICWHYVNKGLQPELNKSSDGESRLAVDSTQITLRQILTKALRKRKGIIPTAGPHYLLEKKSAPGVPLDLDLKLCETESMDFIMVREHS

{RRDYLKSDRTRPYSGDREPPLVLSTQYRSFRVSMLHKLRPATEIQLG} incorrect?

ISGDKIEIDPVAQPRNTPAKFWGKQKAVSIESDRLAFCNITDDKPSGKSTFRLTFKSPNHEFKHYDFETGTNLTKQIVNRINHILELRASSVRNDYTLWRERRHSRKAHDK
###Gene_Info_Comments GLEAN3_04964 ###
internal exon that is likely to be erroneous has been deleted in sequences below:

DNA:

ATGTTATGCCTTGGGATCGTGTCCAAGCAGGTTATCCGTGATGCCATCCTTCTCAATGACTTCACCAAGAACTTCGACAGCTCACAAACCCGCGAGATAGTGGAGTGCATGTTCCCTATCGACTATAAGAAGGGCCAAATAGTCATCAATGAGGGCGACTCAGGAGCACACTTCTACGTCGGAGCAACGGGTACCCTTGAGGTGAGCCAAGGTGATCGCGTCCTGGCCACTATGGGACCGGGAAAGGTCTTCGGGGAACTGGCCATCCTCTATAACTGCACCAGAACAGCCACCATCACTGCCGTCACTGACGCGCAAGTATGGGCGATCGATAGAAAAGTGTTCCAGCTGATCATGATGAAAACTGGGATGCAGCGCCATGAAGAGTATTTCAACTTCCTTAAGAGTGTGCCTTTGCTCAAAGATTTGTCTTCCGATAACCTCTTCAAGTTGGCGAACAGTTTGGAAGTAGACTTCTTCCATGAAGGTGAATACATTATAGTGGAGGGCTCCAGGGGAGATACCTTCTACATCATTAGTAAGGGGGAGGTCCGGATAACCCAATCCGTCCAAGGACAGAGAGAACCCCAGGAGGTTCGAAGCCTCCAGAAAGGAGACTTCTTCGGTGAGAAAGCGCTCCTTGGTGAGGACGTACGAACAGCGAATGTCTTGGCCAGCAAAGGGGGATGCGAGTGCTTGGCCGTTGATAGACAGTCTTTCAACGAACTGATCGGCAACATGCAGGCACTCCAGGACAAGAATTATGGAGACAAAGAAAGGGGAGCAACCAGGTCGAGCTCGGAGATGGATAATACAGAGATTGCACGAATCAAGCCGATACAAGATGAGCTAGCTGCTATACATCTCAACGATCTGGATATCATCGCTACATTGGGTGTTGGAGGGTTCGGTCGGGTCGAACTGGTTCAACTGGCAGGCGATAAGCGGACATTCGCCCTCAAGTGTTTGAAGAAACATCACATCGTAGAAACTCGGCAACAGGAACATATCTTTTCTGAGAAGAAGATCATGATGGAATCTAGCTCCCCCTTTATTGTCAAATTGTTCAAGACATTCCGTGATCAGAAGTATATCTACATGCTTATGGAAGTCTGCTTAGGAGGAGAGCTCTGGACTATCCTCAGGGACAAGGGTCATTTTGATGACCGGACAGCAAGGTTTTCCACCGCATGCGTAGTTGAAGCTTTCCACTATTTGCACAGTCGCGGCATCGTCTACCGCGATCTCAAGCCTGAGAATCTGCTCCTTGACAACAAAGGCTACGTCAAATTGGTCGACTTTGGTTTCGCGAAGAAGATCGGTTTTGGTCGTAAGACCTGGACCTTTTGTGGCACTCCCGAGTACGTGGCACCGGAAATCATCCTCAACAAAGGTCATGACCTGTCATGTGACTACTGGTCCCTGGGAATCCTCATCTTTGAGCTTTTGACCGGAAATCCGCCATTCACTGCCAATGATCCCATGAAGACATACAACGTTATTCTAAAGGGTATCGACATGGTCGAGTTTCCACGGAAGATTCCTCGTAGTGCTGGTAACCTTATCAAGCGACTCTGTCGGGACAATCCAGGCGAGAGAATCGGCTACCAGAAGAATGGCATTAGTGATATCAAGAAGCACAAATGGTTCCAAGGTTTTGACTGGGAAGGTCTCAGGAAGCAAGAAATTGCCGCCCCTCTTCCTCCAAAGGTGAAAGGCTCAAGCGACTGCAGCAACTTCGACAGCTACCCGAAAGATGTCGATATCCCGGCCGATGAAACGTCGGGATGGGACGAACACTTTTAA

Protein:
MLCLGIVSKQVIRDAILLNDFTKNFDSSQTREIVECMFPIDYKKGQIVINEGDSGAHFYVGATGTLEVSQGDRVLATMGPGKVFGELAILYNCTRTATITAVTDAQVWAIDRKVFQLIMMKTGMQRHEEYFNFLKSVPLLKDLSSDNLFKLANSLEVDFFHEGEYIIVEGSRGDTFYIISKGEVRITQSVQGQREPQEVRSLQKGDFFGEKALLGEDVRTANVLASKGGCECLAVDRQSFNELIGNMQALQDKNYGDKERGATRSSSEMDNTEIARIKPIQDELAAIHLNDLDIIATLGVGGFGRVELVQLAGDKRTFALKCLKKHHIVETRQQEHIFSEKKIMMESSSPFIVKLFKTFRDQKYIYMLMEVCLGGELWTILRDKGHFDDRTARFSTACVVEAFHYLHSRGIVYRDLKPENLLLDNKGYVKLVDFGFAKKIGFGRKTWTFCGTPEYVAPEIILNKGHDLSCDYWSLGILIFELLTGNPPFTANDPMKTYNVILKGIDMVEFPRKIPRSAGNLIKRLCRDNPGERIGYQKNGISDIKKHKWFQGFDWEGLRKQEIAAPLPPKVKGSSDCSNFDSYPKDVDIPADETSGWDEHF
###Gene_Info_Comments GLEAN3_14574 ###
This is the N terminal part of DGK-beta; GLEAN3_14575 encodes the C terminal part. The sequence has been modified below, since the 3' part of the original sequence is actually non-coding, as has been indicated below.
DNA:
ATGCCTGCTTGTAAGGGAACATGGCCTTTCTCTCAGTCCTATGCCTCCCTCGGCCAAACGGACCAAAAGAGACAGAACTCCAAGAAAGAGCGACCGAGCTGGAAGTACCGTCTCTTTCGCAACTCCAAACGAGGCCGGCAGAAAAAGGAAAAGGATAATAGGATTTCGAAACCCCTTTTTGTGCCTTGTACCTCGGCCCTTGTGATGGCGCTGCAGGGCCAAGCTCAGGATGTAGTGGATATCGTCATG

{CCTTGTCCTTTCAAGACCAATGCTGGCACCTTAATGTTGTTGAAGCGTGTCACATCTAATCCCCCGAAGCCGTATGTCCCGCTTCTGCATTTTCCTTCACTGGAATTCGCAGTTTACACGGGACCCCTGACCTTCTTGGTCTTGACACCGATCAATTTAAGCATCACTCCGGGGTAA} Not Real!

Protein:
MPACKGTWPFSQSYASLGQTDQKRQNSKKERPSWKYRLFRNSKRGRQKKEKDNRISKPLFVPCTSALVMALQGQAQDVVDIVM


{PCPFKTNAGTLMLLKRVTSNPPKPYVPLLHFPSLEFAVYTGPLTFLVLTPINLSITPG} Not Real!
###Gene_Info_Comments GLEAN3_23009 ###
this protein lacks a start codon and has several small regions that are possibly spurious, and one missing short internal sequence. These are indicated in the protein sequence below:
VPQPYFNLKKRISEEVEVRQKADPPILPIMTKTELADDIVKLIPDSGLRTPIELQEAILFLHEIGSIVHFTDHLNGLNDLYFIDPVWLACTLQRVTALPIGSLKGGKVHVETLRELSKKSSIEEDKFEQYLQLLARFEIVVPISHHWYLVPARLPRDNPGVMLSPHNTDDAPFHYLRRIYKMPYLPPGFWIRLVSRLIADLQMRDKKKKISSAGNERNLGSSKR
{LTDDEMFPFQR} spurious?
KSSISEAISFHQTESIYWREGIFFRHNTGQILVRSMVFPSTDKSPGVDILISCQEGHFSAMGCVVDQIEGLIKDWYP
{GLCTSIHESIQPKVQRLVPCPICVIHGPDDFEPVT} spurious?
EDLPHCYTVEELAQTYVRGETHITACANSKEKPITISVLIPDMFMKDLSIRHFKQEDFTLHMVSGQSLGQGGFGEVFRAKFRGETVAAKTMLPSRLLKNRMFSSASEGYASCASTSSSTSNRTGESTSTENDSLEAAMLMESFHKLRNEVAIMAKLDHPYIVNLVGVSIRHLCFAMDYAPLGDLRSYLFAEHQSARPHFVKRNIVLEPVLSRMLTYKISLQVASAVGYLHRKDIIYCDLKTDNILLFSSDVNEDVNIKLIDYGISKKYDLMGAMGMAGTPGFCAPEILQGKTFDEKVDWFSYGMFLYHLMTGLVPYYDQHSRIEIELAVNEGRKPTFNFHEYTMPPKQVFPALGALMESCWQNKPGERPHGETTLQLLSEPSFLCLRRVVEVEEEEGVSLAFSQGSQDE
{LQAFAKQPLAVNANGCKTASLFVI } this is missing from the GLEAN
DKVVHLIVESGRGTSVRSFQVDEDGCYKSSLLNELQCPMIRTAIATPCGTKIVVGTGGDCVQLYHLPSSHSSHASLLVEARVAGQPTSLHYIQKPSGQEHSLLFVGQANGVLTVLSHETEDSGHHITDDLKLVTRMQLSKHNLPCSSIVAVSRKKNGDSMAEQRRRYEEVVYNGANGVWNRSAKSNGTREERRDETTARKMRGGRNSLEPRERTGGSRDEADGTEVWVGCGNKLRIILLDDITLEPDGIQVAAGMEGIIEGIVQSQGSVWCFTSSALYVYQYSTETRSCLAILDCRESILVPGSFLPLYQEKRQEL
{VRSWEEKREKEQELASATA} spurious?
ERTVNIIRPRSVGQLSVFAYKLARRPQF
###Gene_Info_Comments GLEAN3_11785 ###
The terminal part of this GLEAN is actually the first exon of a GST protein. Modified seqs are below:
DNA:
ATGGCTTTAGCTCAGGAGGAGCTGACCATGATGAAGGGGAAGATCAACAGCCAGAAAGAGATGGGTAAACAC AAACTGGAAGGAGAACTTACCCAGCTGAAAGAGGTTCCAAGTTACCATTCAGATCTGTGCACGATTCCCAGT GTTGTTGCTGATGTCTGCAGTGACACAAGAGACCTCTCAGGGTTGGACATTGCTAAAAAGCGTGTGGCCCTG ATGGAACTCCTGCAGAAGCAGTCACAGCGTAAGGCACTAGCTGCCCTGGAAAGGCGACAGGCAACCCTTGAG GTGCAAGAAAGAACTGCTCAATTGATTCTGAAACATGAGGAGCAAAAACTTCAGGATTTTGAAACTGATCCA CGTTTGCTTGAGGGAGAGTTCGGTTCAGAGCAGTCACCTGAAAAAGAGCTTGGAGATCGTGACATTGAGAAC ACAGTGGATAGTTCTACAGAAAATGTGAGCACAGAATTGCCAGCATCAGGTCGAGTCAGTCCTGTAGGAAAC AGTGACCCTGAAACAATGCATCACAACAATGTTCCTTCTCCGAAGAAAGAGACTAAGCGCATCCAAAAGCAG GAAAACAAAGTTAAGACTATGGATCAAAAGAAGGACCAAACAAACAGTTCAAAGAAAAATTCCAGAAAAACT GATGCTAGACCTACTAGAAAGGTTGAATTAGGTCCTTCAAAAACTAAACCCAGGGAAGAACAAATAAAGAGG GAAAGTGATGGTCTATCTGCATCACCTTCAAGGTCTCCATCTCCTCCCAAAAGTTCAAGGTCAAGCTCGCCA AGTAAATCCTCGAGGTCCAGCTCACCCACCAAGTCACTAAGATCAGATTCACCAACAAAATCCTCAAGGTCC AACTCACCCACCAAGTCACTAAGATCAGATTCACCAACAAAATCCTCAAGGTCCAACTCACCCACCAAGTCA CTAAGATCAGATTCACCAACAAAATCCTCAAGGTCCAACTCACCCACCAAGTCATTAAGATCAGATTCACCA ACAAAATCTTCAAGGTCTGCATCTCCAACCAAATCTGTAAGGTCTGAATCACCGACCAAGTCATCAAGATCA TCTTCTCCTGCAAGTACCTCCTCGAGGCAGTCCAGGCAAACTTCAGGAAATGTTTTTTCTCGCCTGTATCCC CAGCAAGAAACAAAATTTAATTTTCTCAGGAAGAAGAGTCCTCCAAGATATGGTGAATATGATCCTCACAGA GAGAGAGACAGAGACTCTCCACACTTGATCAGGAGAGAACTGTCTCCTGTAGAGCATGACCCTATCAAGTTA AAGGACAAAAAACGTGAACATGTTCAGGACCATAAGCATGTGAGAACCAGCAGTGTGCCTTCAAGCAATCTT CCAGACAGAACAAAGCAAAATCTTCATTCTAGAAGTAGGAGTGCTAGTCCATGTGTAAATAATCCTGTAAAA CCAGCAACAAGGCCAAAGAAATCTCCAGTATCACAGTCCTCACAAGAAACTGAAAAGGCGGTTAGGAGAACT AAAGTGAAATCAAAAGACACAGCAAGTGCTCCACTTGATAGAACTCAAACTGATCCTGGTAATGAAAAACAA AATAATAAGTGCCAAACCAAAACCAAGCAAGCCCAACGCACATCTCCAAGTGAATCAACCCACTCCAGTCCA ACTAAAAGGAAATTAAATAAACCAGAAAACACTTCAAATGCAAAGACCCCTAAAGGCTCATCTGCCTCTCCT CGGTCAAGACCAATCATTGGGAAAGCAGTTGATAGAAGTCCATCTCCAAGGAAGAAATCAGAACAAACTCCA TCTCACCGGAAGACTGTAAAGAGGCCCCTTTCTAGGAGTCGCAACGATAAGGATGGTGAAAACACAAATTCT CCTAGTCCTAGGAGGAAATTTGCAAGAACTCCCGTTGTCAAACCAAAGAACTGGCCGTCCATAGAGAACTTA 
CCAAAGAGTACTCCCAAATCTGAAGCTTTCTATGTACCTCTCACTAGCGAACAGTTGAGGCTTGCACTTCAG AAGCACGCCAGTGAGCAGGACAGTGGACCACATCTGGAAGGTGCAGATTTGGGGGCTTCAGGATGTGAACAG CACTGGAGTCCCAGTCGTAAGAGGTCTATTGGAAAGCAGCCCCAGCAAACACAAACAGATCCAATTCAGGCC AGCATAGGAGGAGAGATATTTGGTTTGGAAATGGATGGAGTAAACAATTTAGAAAATGAGGAAGATCACTAC AGCTCCTCAGAA
Protein:
MALAQEELTMMKGKINSQKEMGKHKLEGELTQLKEVPSYHSDLCTIPSVVADVCSDTRDLSGLDIAKKRVAL MELLQKQSQRKALAALERRQATLEVQERTAQLILKHEEQKLQDFETDPRLLEGEFGSEQSPEKELGDRDIEN TVDSSTENVSTELPASGRVSPVGNSDPETMHHNNVPSPKKETKRIQKQENKVKTMDQKKDQTNSSKKNSRKT DARPTRKVELGPSKTKPREEQIKRESDGLSASPSRSPSPPKSSRSSSPSKSSRSSSPTKSLRSDSPTKSSRS NSPTKSLRSDSPTKSSRSNSPTKSLRSDSPTKSSRSNSPTKSLRSDSPTKSSRSASPTKSVRSESPTKSSRS SSPASTSSRQSRQTSGNVFSRLYPQQETKFNFLRKKSPPRYGEYDPHRERDRDSPHLIRRELSPVEHDPIKL KDKKREHVQDHKHVRTSSVPSSNLPDRTKQNLHSRSRSASPCVNNPVKPATRPKKSPVSQSSQETEKAVRRT KVKSKDTASAPLDRTQTDPGNEKQNNKCQTKTKQAQRTSPSESTHSSPTKRKLNKPENTSNAKTPKGSSASP RSRPIIGKAVDRSPSPRKKSEQTPSHRKTVKRPLSRSRNDKDGENTNSPSPRRKFARTPVVKPKNWPSIENL PKSTPKSEAFYVPLTSEQLRLALQKHASEQDSGPHLEGADLGASGCEQHWSPSRKRSIGKQPQQTQTDPIQA SIGGEIFGLEMDGVNNLENEEDHYSSSE
###Gene_Info_Comments GLEAN3_06632 ###
The expected zinc fingers were not predicted in the GLEAN3 model.
###Gene_Info_Comments GLEAN3_10672 ###
The N-terminal region (amino acids 1-162) of the GLEAN model shows a sequence match to heat shock protein. The presence of a Pfam H2TH domain and the C-terminal sequence similarity match the Neh-2-annotated GLEAN3_06632.
###Gene_Info_Comments GLEAN3_01388 ###
alpha thalassaemia mental retardation X-linked protein
###Gene_Info_Comments GLEAN3_02329 ###
In addition to Ercc6 homology, the GLEAN3 model has an additional N-terminal region with a sequence match to "similar to Galactosylceramide sulfotransferase (GalCer sulfotransferase) (Cerebroside sulfotransferase) (3'-phosphoadenylylsulfate:galactosylceramide 3'-sulfotransferase) (3'-phosphoadenosine-5'-phosphosulfate:GalCer sulfotransferase)".

###Gene_Info_Comments GLEAN3_20070 ###
This model was annotated based on a manual inspection of protein alignments and domain structures. The features of this glean model are supported by other predictions and genome-wide tiling array embryonic hybridization data.


###Gene_Info_Comments GLEAN3_09913 ###
There is notable sequence match to GLEAN3_07783.
###Gene_Info_Comments GLEAN3_10307 ###
binds to CREB-binding protein (CBP); related to Snf2 family of proteins (by similarity).
###Gene_Info_Comments GLEAN3_12027 ###
Conserved domain DEAD/H box 1 identified as expected for smarcad homolog

cd00046, DEXDc, DEAD-like helicases superfamily.
scored 75.9  expectation 1e-14 
###Gene_Info_Comments GLEAN3_02950 ###
Partial sequence of GLEAN3_12238, located on a different scaffold.
###Gene_Info_Comments GLEAN3_15432 ###
Protein involved in transcription-coupled nucleotide excision repair of UV-induced DNA lesions; homolog of human CSB protein; Rad26p [Saccharomyces cerevisiae] (by similarity).

###Gene_Info_Comments GLEAN3_19459 ###
Human Chrom-1 sequence match confined to residues 750 to end of the GLEAN3 model.  The N-terminal region of the model may be more similar to other isoforms of the same family.
###Gene_Info_Comments GLEAN3_19921 ###
Similar to Saccharomyces cerevisiae RAD26, Homo sapiens ERCC6 and chromodomain helicase proteins of the SNF2 family.
###Gene_Info_Comments GLEAN3_24818 ###
Related to GLEAN3_28391 with higher coverage of the Query used.
###Gene_Info_Comments GLEAN3_07862 ###
This gene was annotated based on a manual inspection of multiple protein sequence alignments and domain structure.

The structure of this model is supported by the genome-wide tiling array embryonic hybridization experiment and by very similar models generated by other gene prediction protocols.

Its structure resembles that of a partial Cbl gene, with most of the N-terminal domains of Cbl genes missing from this model, and we have therefore named this gene "Sp-Cbl-related 1". Sp-Cbl [GLEAN3_07863] is located immediately upstream of this gene and in the opposite orientation. Given that both models map to a large region of uninterrupted sequence, that they are in opposite orientations and the strong correlation with the tiling array hybridization data, we believe it is unlikely these models represent an assembly error but that they may represent a true localized gene rearrangement event. Nonetheless, additional experimental data are needed to confirm these observations.
###Gene_Info_Comments GLEAN3_28332 ###
SWI/SNF-related matrix-associated actin-dependent 
regulator of chromatin subfamily A member 3.

TNF-response element binding protein.
###Gene_Info_Comments GLEAN3_28391 ###
Related to GLEAN3_24818 as a subset.
###Gene_Info_Comments GLEAN3_02003 ###
Partial prediction. Missing the last 150 AA from human protein.
###Gene_Info_Comments GLEAN3_07863 ###
This gene was annotated based on a manual inspection of multiple protein sequence alignments and domain structure.

The structure of this model is supported by the genome-wide tiling array embryonic hybridization experiment and by very similar models generated by other gene prediction protocols.

Its structure resembles that of a partial Cbl gene, with the most N-terminal Pfam Cbl_N domain missing from this model.

A closely related model [GLEAN3_07862] is located immediately upstream of this gene and in the opposite orientation. Given that both models map to a large region of uninterrupted sequence, that they are in opposite orientations, and that they correlate strongly with the tiling array hybridization data, we believe it is unlikely that these models represent an assembly error; they may instead represent a true localized gene rearrangement event. Nonetheless, additional experimental data are needed to confirm these observations.
###Gene_Info_Comments GLEAN3_11695 ###
Likely missing a terminal exon (or more).
###Gene_Info_Comments GLEAN3_13568 ###
Close match to GLEAN3_23027.
###Gene_Info_Comments GLEAN3_23027 ###
Close match to GLEAN3_13568.
###Gene_Info_Comments GLEAN3_03313 ###
This gene model has been modified by adding GLEAN3_03312 to the 5' end. The best BLAST hit for this sequence is patched, but by phylogenetic analysis it is most similar to human Niemann-Pick C2.
###Gene_Info_Comments GLEAN3_03312 ###
This gene model has been modified by adding it to the 5' end of GLEAN3_03313. The best BLAST hit for this sequence is patched, but by phylogenetic analysis it is most similar to human Niemann-Pick C2.
###Gene_Info_Comments GLEAN3_03472 ###
The GLEAN3 model does not cover the C-terminal helicase region of the mouse Brip-1 Query.
###Gene_Info_Comments GLEAN3_09499 ###
The GLEAN3 model does not cover the C-terminal region of the Trel-1 Query.
###Gene_Info_Comments GLEAN3_12100 ###
The GLEAN3_12100 sequence coverage is limited to the N-terminal 250 amino acids of the Query sequence, presumably because the GLEAN3_12100 model is short.

The GLEAN3_12100 sequence is contained within GLEAN3_23149.
###Gene_Info_Comments GLEAN3_23149 ###
GLEAN3_12100 is "contained" within this GLEAN3 model.
###Gene_Info_Comments GLEAN3_28874 ###
Very similar to GLEAN3_13756.
###Gene_Info_Comments GLEAN3_02790 ###
Appears to be identical to GLEAN3_22735.
###Gene_Info_Comments GLEAN3_22479 ###
Appears to be identical to GLEAN3_04237
###Gene_Info_Comments GLEAN3_10625 ###
C-terminal half, probably missing an exon in the middle
missing N-terminal half
###Gene_Info_Comments GLEAN3_05083 ###
ectopic transmembrane domain at the N-terminus
###Gene_Info_Comments GLEAN3_10032 ###
shorter than expected, missing N-terminus?
###Gene_Info_Comments GLEAN3_22285 ###
SRCR(3). Probably incomplete. See GLEAN3_22286, 22287, 22288, 22289.
###Gene_Info_Comments GLEAN3_22286 ###
SRCR(10)-TM. Probably incomplete. See GLEAN3_22285, 22287, 22288, 22289.
###Gene_Info_Comments GLEAN3_22287 ###
SRCR(4)-TM. Probably incomplete. See GLEAN3_22285, 22286, 22288, 22289.
###Gene_Info_Comments GLEAN3_22288 ###
SRCR(3). Probably incomplete. See GLEAN3_22285, 22286, 22287, 22289.
###Gene_Info_Comments GLEAN3_22289 ###
SRCR(4)-TM. Probably incomplete. See GLEAN3_22285, 22286, 22287, 22288.
###Gene_Info_Comments GLEAN3_22339 ###
SigPep-SRCR(3)-TM.
###Gene_Info_Comments GLEAN3_22423 ###
SRCR(4). Probably incomplete. See GLEAN3_22424.
###Gene_Info_Comments GLEAN3_22424 ###
SRCR(8)-Sushi(2). Probably incomplete.  See GLEAN3_22423. Like >gi|8547249|gb|AAF76319.1|AF228827_1 scavenger receptor cysteine-rich protein [Strongylocentrotus purpuratus]. 
###Gene_Info_Comments GLEAN3_22528 ###
SRCR(9)-EGF-SRCR(5). Possibly incomplete.
###Gene_Info_Comments GLEAN3_22567 ###
SRCR(5)-TM. Probably incomplete.  See GLEAN3_22568, 22569.
###Gene_Info_Comments GLEAN3_22568 ###
SRCR(5). Probably incomplete.  See GLEAN3_22567, 22569.
###Gene_Info_Comments GLEAN3_22569 ###
SRCR(3). Probably incomplete.  See GLEAN3_22567, 22568.
###Gene_Info_Comments GLEAN3_22814 ###
SRCR(4). Probably incomplete.
###Gene_Info_Comments GLEAN3_23641 ###
SigPep-SRCR(4)-TM.  
###Gene_Info_Comments GLEAN3_23677 ###
SRCR(5)-TM. Possibly incomplete.
###Gene_Info_Comments GLEAN3_23840 ###
SRCR(2). Probably incomplete.
###Gene_Info_Comments GLEAN3_23153 ###
Very similar but not identical in sequence to GLEAN3_23152.
###Gene_Info_Comments GLEAN3_25824 ###
Groups with the caspase 9 subfamily in neighbor joining of a multiple sequence alignment. Model may be missing an exon, as the predicted protein contains a CARD domain but no caspase (peptidase C14) domain. Similar but not identical to the N-terminus of GLEAN3_00882.
###Gene_Info_Comments GLEAN3_23991 ###
SRCR(2). Probably incomplete.
###Gene_Info_Comments GLEAN3_24084 ###
SRCR(4). Probably incomplete. 
###Gene_Info_Comments GLEAN3_24390 ###
SigPep-SRCR(2)-WSC-TM.
###Gene_Info_Comments GLEAN3_24408 ###
SigPep-SRCR(7)-TM.
###Gene_Info_Comments GLEAN3_24440 ###
SigPep-SRCR(4)-TM.
###Gene_Info_Comments GLEAN3_24487 ###
SRCR(9). Probably incomplete. Like gi|8547243|gb|AAF76316.1|AF228824_1 scavenger receptor cysteine-rich protein variant 1 [Strongylocentrotus purpuratus] and >gi|8547245|gb|AAF76317.1|AF228825_1 scavenger receptor cysteine-rich protein variant 2 [Strongylocentrotus purpuratus]
###Gene_Info_Comments GLEAN3_25862 ###
SRCR(3). Probably incomplete. See GLEAN3_25865.
###Gene_Info_Comments GLEAN3_25865 ###
SRCR(3). Probably incomplete. See GLEAN3_25862.
###Gene_Info_Comments GLEAN3_25968 ###
SigPep-SRCR(3). Probably incomplete.
###Gene_Info_Comments GLEAN3_25983 ###
SRCR(27). Probably incomplete.
###Gene_Info_Comments GLEAN3_26234 ###
SRCR(3). Probably incomplete.
###Gene_Info_Comments GLEAN3_26241 ###
SRCR(2). Probably incomplete.
###Gene_Info_Comments GLEAN3_26408 ###
SigPep-SRCR(3). Probably incomplete.
###Gene_Info_Comments GLEAN3_26709 ###
SigPep-SRCR(4). Probably incomplete.
###Gene_Info_Comments GLEAN3_26848 ###
SRCR(8). Probably incomplete.  See GLEAN3_26849.
###Gene_Info_Comments GLEAN3_26849 ###
SRCR(14)-TM. Probably incomplete. See GLEAN3_26848.
###Gene_Info_Comments GLEAN3_27037 ###
SigPep-SRCR(2). Possibly incomplete.
###Gene_Info_Comments GLEAN3_27287 ###
SRCR(4). Probably incomplete.  See GLEAN3_27288.
###Gene_Info_Comments GLEAN3_27288 ###
SigPep-SRCR(17). Probably incomplete. See GLEAN3_27287. Like >gi|4165053|gb|AAD08654.1| scavenger receptor cysteine-rich protein type 12 precursor [Strongylocentrotus purpuratus].
###Gene_Info_Comments GLEAN3_27379 ###
SRCR(9). Probably incomplete.
###Gene_Info_Comments GLEAN3_27503 ###
SigPep-SRCR(3). Probably incomplete.
###Gene_Info_Comments GLEAN3_27619 ###
SRCR(5). Probably incomplete.
###Gene_Info_Comments GLEAN3_28233 ###
SigPep-SRCR(3). Probably incomplete.
###Gene_Info_Comments GLEAN3_28382 ###
SRCR(3). Probably incomplete.
###Gene_Info_Comments GLEAN3_28612 ###
SRCR(6). Probably incomplete.
###Gene_Info_Comments GLEAN3_28669 ###
SigPep-SRCR(3). Possibly incomplete.
###Gene_Info_Comments GLEAN3_28680 ###
EGF_CA(6)-EGF-SRCR(2)-EGF(2).
###Gene_Info_Comments GLEAN3_28804 ###
SRCR(3). Probably incomplete.
###Gene_Info_Comments GLEAN3_17888 ###
partial
###Gene_Info_Comments GLEAN3_08981 ###
From Best Accession annotation -
"OB-fold nucleic acid binding domain. This family contains OB-fold domains that bind to nucleic acids. The family includes the anti-codon binding domain of lysyl, aspartyl, and asparaginyl -tRNA synthetases (See pfam00152). Aminoacyl-tRNA synthetases catalyse the addition of an amino acid to the appropriate tRNA molecule EC:6.1.1.-. This family also includes part of RecG helicase involved in DNA repair. Replication factor A is a heterotrimeric complex, that contains a subunit in this family. This domain is also found at the C-terminus of bacterial DNA polymerase III alpha chain."
###Gene_Info_Comments GLEAN3_02063 ###
GLEAN3_02063 is a partial duplicate prediction for GLEAN3_04403. 
###Gene_Info_Comments GLEAN3_18479 ###
PARTIAL
###Gene_Info_Comments GLEAN3_10770 ###
e val for AAH01211 = 1e-75; Kinesin family member C3 [Homo sapiens].
e val for NP_005541 = 2e-77; KIFC3 [Homo sapiens].
GLEAN3_10770 overlaps the entire consensus motor domain when compared to human CENP-E, and has a long C-terminal domain.
Annotation by RA Obar, RL Morris, AL Silverio, BJ Chick, and B Rossetti.
###Gene_Info_Comments GLEAN3_13729 ###
e val = 7e-77 for AAH01211, Kinesin family member C3 [Homo sapiens].
e val = 1e-78 for NP_005541, KIFC3 [Homo sapiens].
Annotation by RA Obar, RL Morris, AL Silverio, BJ Chick, and B Rossetti.
###Gene_Info_Comments GLEAN3_19636 ###
PARTIAL

###Gene_Info_Comments GLEAN3_15809 ###
e val for NP_065867 is 8e-45.
Likely to be a fragment based on its short length
Annotation by RA Obar, RL Morris, BA Jeffrey, and B Rossetti.
###Gene_Info_Comments GLEAN3_15437 ###
e val = 6e-168 against NP_004511, and e-149 for NP_006836; KIF2C [Homo sapiens].
Annotation by RA Obar, RL Morris, AL Silverio, BJ Chick, and B Rossetti.
###Gene_Info_Comments GLEAN3_20634 ###
e val for NP_004312, "axonal transport of synaptic vesicles" [Hs], is 3e-118. 
e val for NP_904325, kinesin family member 1B isoform alpha [Hs], is e-116
See also GLEAN3_18764.
Annotation by: RA Obar, RL Morris, BA Jeffrey, and IJ Strachan
###Gene_Info_Comments GLEAN3_21656 ###
CAA40175 is KHC cloned from S. purpuratus.
Q66K46_HUMAN Q66K46 (UniProtKB/TrEMBL accession number)
e val = 0.0 for Q66K46
e val = 0.0 for NP_004512 KIF5B [Homo sapiens].
Through comparison with CENP-E (NP_001804.2) as defined by Pfam PF00225, N-terminus of motor domain is likely incomplete.  
Peptide length=1,077 AA. 
Annotation by RA Obar, RL Morris, AL Silverio, BJ Chick, SA Tower, LE Shorey, and AP Rawson.  

###Gene_Info_Comments GLEAN3_22296 ###
e val for AAH28155=7e-115, and against NP_015556 is e-117.
Annotation by RA Obar, RL Morris, BA Jeffrey, AM Musante.
###Gene_Info_Comments GLEAN3_23126 ###
e val for NP_001804 = 2e-101, and for Q02224=e-139; CENPE_HUMAN [Homo sapiens].  
Kinesin-7 family member.  
See also Glean3_17809 which also hits Q02224.
CENPE_HUMAN data obtained from UniProtKB/Swiss-Prot entry Q02224.   
Annotation by RA Obar, RL Morris, SA Tower, KM Judkins
###Gene_Info_Comments GLEAN3_12645 ###
Inspection of the tiling array suggests that glean may have missed the following exons: TSVTQISRPLSLSLPLSSVAIHTFPSGSFPLYSSPYSPSLPLLFRCLHLSLCLLSLTLFSPFSFSPFSNYSFPSLHLPPSQSDT,ARIFHGRFAILLLGKWSLRDSERPKIILLWRGGKTARIIVLLLCFLNPHESYMIYDNLNIDLHEHELQLLRSVGLSLSLSLSPQLQYTPSLLVRSPCIPLRILLLYLFCFAVSISHFVCSV
###Gene_Info_Comments GLEAN3_26237 ###
e val for CAI43180 and for NM_024704 is 0.0. 
Annotation by: RA Obar, RL Morris, BA Jeffrey.
###Gene_Info_Comments GLEAN3_09940 ###
e val for NP_005541 is 3e-40.
e val = 5e-40 against "CAK04214.1| novel kinesin motor domain containing protein [Danio rerio], Length=690"
See also GLEAN3_13729 and GLEAN3_10770, which are also KIFC3-like.
Annotations by RA Obar, RL Morris, BA Jeffrey, and B Rossetti.
###Gene_Info_Comments GLEAN3_11200 ###
Strongylocentrotus purpuratus similar to F-box only protein 28 (LOC574995)
###Gene_Info_Comments GLEAN3_04722 ###
Strongylocentrotus purpuratus similar to WD-repeat 
protein 26 (LOC579141), mRNA

###Gene_Info_Comments GLEAN3_06169 ###
e val = 2e-160 for XP_780214 "PREDICTED: similar to breast cancer metastasis-suppressor 1-like [Strongylocentrotus purpuratus]"
e val = e-55 for XP_789383 "PREDICTED: similar to kinesin-like motor protein C20orf23 [Strongylocentrotus purpuratus]"
e val = e-32 for NP_115728, breast cancer metastasis-suppressor 1-like [Homo sapiens]
Annotation by R.A.Obar and R.L. Morris 020106
###Gene_Info_Comments GLEAN3_00882 ###
Very similar to GLEAN3_01683.
###Gene_Info_Comments GLEAN3_01683 ###
Very similar to GLEAN3_00882 and GLEAN3_13850; may be a duplication or haplotype of the latter.
###Gene_Info_Comments GLEAN3_09653 ###
Very similar to GLEAN3_17523, may be a duplication or haplotype.  Also significant similarity to C-terminus of GLEAN3_11471.
###Gene_Info_Comments GLEAN3_13850 ###
Very similar to GLEAN3_01683; may be a duplication or haplotype.
###Gene_Info_Comments GLEAN3_21561 ###
Very high similarity to C-terminal sequences of GLEAN3_09497, GLEAN3_11339, GLEAN3_26645, GLEAN3_22941, and GLEAN3_26743.  Also has significant sequence similarity to parts of GLEAN3_01472, GLEAN3_09653, and GLEAN3_17523
###Gene_Info_Comments GLEAN3_11916 ###
This GLEAN MAY be similar to the human nuclear receptor coactivator 5 (NCOA5).
###Gene_Info_Comments GLEAN3_26645 ###
Very high similarity to C-terminal sequences of GLEAN3_09497, GLEAN3_11339, GLEAN3_21561, GLEAN3_22941, and GLEAN3_26743.  Also has significant sequence similarity to parts of GLEAN3_01472, GLEAN3_09653, and GLEAN3_17523
###Gene_Info_Comments GLEAN3_22941 ###
Very similar to GLEAN3_09497, GLEAN3_11339, GLEAN3_21561, GLEAN3_26645, and GLEAN3_26743.  Also has significant sequence similarity to parts of GLEAN3_01472, GLEAN3_09653, and GLEAN3_17523.  Missing N-terminus (no methionine).
###Gene_Info_Comments GLEAN3_00205 ###
Contains MMR_HSR1 domain (GTPase of unknown function domain)
###Gene_Info_Comments GLEAN3_01505 ###
Contains MMR_HSR1 domain (GTPase of unknown function domain)
###Gene_Info_Comments GLEAN3_04604 ###
Contains MMR_HSR1 domain (GTPase of unknown function domain)
###Gene_Info_Comments GLEAN3_05393 ###
Contains MMR_HSR1 domain (GTPase of unknown function domain)
###Gene_Info_Comments GLEAN3_26743 ###
Very similar to C-terminal sequences of GLEAN3_09497, GLEAN3_11339, GLEAN3_21561, GLEAN3_22941, and GLEAN3_26645.  Also has significant sequence similarity to parts of GLEAN3_19865, GLEAN3_10466, and GLEAN3_02718....
###Gene_Info_Comments GLEAN3_09555 ###
Contains MMR_HSR1 domain (GTPase of unknown function domain)
###Gene_Info_Comments GLEAN3_10114 ###
Contains MMR_HSR1 domain (GTPase of unknown function domain)
###Gene_Info_Comments GLEAN3_13859 ###
Contains MMR_HSR1 domain (GTPase of unknown function domain)
###Gene_Info_Comments GLEAN3_14831 ###
Contains MMR_HSR1 domain (GTPase of unknown function domain)
###Gene_Info_Comments GLEAN3_16886 ###
Contains MMR_HSR1 domain (GTPase of unknown function domain)
###Gene_Info_Comments GLEAN3_17772 ###
Contains MMR_HSR1 domain (GTPase of unknown function domain)
Note: Identical to GLEAN3_16886 except that the 3' end is missing.
###Gene_Info_Comments GLEAN3_20224 ###
Contains MMR_HSR1 domain (GTPase of unknown function domain)
###Gene_Info_Comments GLEAN3_20366 ###
Contains MMR_HSR1 domain (GTPase of unknown function domain)
###Gene_Info_Comments GLEAN3_21394 ###
Contains MMR_HSR1 domain (GTPase of unknown function domain)
###Gene_Info_Comments GLEAN3_24411 ###
Contains MMR_HSR1 domain (GTPase of unknown function domain)
###Gene_Info_Comments GLEAN3_25797 ###
Contains MMR_HSR1 domain (GTPase of unknown function domain)
###Gene_Info_Comments GLEAN3_02178 ###
Model contains two adjacent astacin protease domains followed by an EGF domain. This architecture is unique among members of this group of metalloproteases.
###Gene_Info_Comments GLEAN3_00238 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_00366 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_01565 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_01759 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02067 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02683 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02822 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02867 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_04191 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_04693 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_05923 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_06695 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_07047 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_08044 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_09492 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_12220 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_12387 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_12571 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_13691 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_14169 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_15602 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_17595 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_18058 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_18349 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_18424 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_19037 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_19246 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_20433 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_20553 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_23417 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_23579 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_23581 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_23748 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_24254 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_25387 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_26420 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_26642 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_26749 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_26754 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_27950 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_28440 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_28648 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_00445 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_00465 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_00466 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_00609 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_00833 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_00894 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_01635 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_01693 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_01908 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02377 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02503 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02577 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02621 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02721 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02805 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_03694 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_03790 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_04339 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_05077 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_05245 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_05535 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_05934 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_06734 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_07281 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_08232 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_08425 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_08451 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_08765 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_09051 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_10863 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_11373 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_12033 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_12737 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_14033 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_14153 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_14506 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_14870 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_14988 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_15532 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_15646 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_15686 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_16297 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_16809 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_16905 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_16951 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_18034 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_18161 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_18700 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_18896 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_19122 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_19321 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_19619 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_19867 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_20189 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_20285 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_20438 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_20439 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_20521 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_21756 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_22660 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_23533 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_23891 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_23940 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_24477 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_25050 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_25461 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_25562 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_27043 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_28654 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_28929 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_00093 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_00306 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_00467 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_00846 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_01111 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_01234 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_01375 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_01553 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_01637 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_01661 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_01961 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02318 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02354 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02362 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02502 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_03081 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_04281 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_04696 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_04847 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_05943 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_06062 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_06184 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_06480 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_06827 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_07002 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_07525 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_08092 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_08706 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_08729 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_08925 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_09102 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_09779 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_10037 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_10166 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_10251 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_10558 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_10732 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_11590 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_11627 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_11788 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_12030 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_12346 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_12718 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_12796 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_12836 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_12916 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_13518 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_13548 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_14356 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_14675 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_14825 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_15193 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_15236 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_16072 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_16281 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_16620 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_16985 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_17565 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_17576 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_18089 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_18398 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_18699 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_19093 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_19342 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_19660 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_19832 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_20352 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_20583 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_21066 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_21094 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_21349 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_21730 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_21848 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_22453 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_22688 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_22751 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_23418 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_23419 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_23580 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_23699 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_23857 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_24550 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_24562 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_25109 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_26158 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_26519 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_26812 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_26911 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_27184 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_27190 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_27569 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_28350 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_28515 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_28657 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_28815 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_00307 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_00464 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_00474 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_00892 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_01021 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02486 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02772 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02828 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02905 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_03454 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_03576 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_03611 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_04020 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_05019 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_05645 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_05897 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_06563 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_07637 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_07666 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_07835 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_08980 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_08993 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_09181 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_11743 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_12142 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_13389 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_13690 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_14085 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_14175 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_15261 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_15560 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_15871 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_16038 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_16047 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_16236 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_16420 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_17399 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_17732 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_18526 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_18560 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_18569 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_19977 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_20103 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_20800 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_20913 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_22170 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_22188 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_23994 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_25103 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_25148 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_25417 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_25463 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_25602 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_25804 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_26462 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_26670 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_27750 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_28062 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_28816 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_00685 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_00768 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_01509 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02089 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02504 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02699 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_02794 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_04941 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_04970 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_05173 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_05210 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_05424 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_05591 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_05704 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_05933 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_06061 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_06205 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_07032 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_07314 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_07489 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_07705 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_07762 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_07953 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_08055 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_08489 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_09064 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_09329 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_09565 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_09569 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_11273 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_11680 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_11907 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_12107 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_12144 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_12231 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_12331 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_12802 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_13496 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_14872 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_14878 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_14893 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_14933 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_15147 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_15166 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_15201 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_15645 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_15663 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_15694 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_15911 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_15956 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_16469 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_17435 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_17482 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_17997 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_18670 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_19218 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_19474 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_19847 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_20135 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_20454 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_20503 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_20862 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_21741 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_22016 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_22963 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_23075 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_23698 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_24131 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_25530 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_25880 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_26174 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_26434 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_26542 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_27036 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_27210 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_27269 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_27412 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_27474 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_27801 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_28399 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_28656 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_28658 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_28932 ###
Portion of the early histone gene repeat
###Gene_Info_Comments GLEAN3_01465 ###
Model contains exons encoding CUB repeats that are nearly identical to GLEAN3_08802. It is probably a partial CDS of an allele of GLEAN3_08802 or of another closely related gene.
###Gene_Info_Comments GLEAN3_11658 ###
The predicted ORF has an N-terminal sequence longer than that of homologous cyclin H in other species. The first Met of these cyclin H proteins is conserved in Sp, raising the possibility that the initiation codon predicted in the features is not the true one.
###Gene_Info_Comments GLEAN3_00328 ###
Three GLEAN predictions, GLEAN3_00328, GLEAN3_14989 and GLEAN3_11295, encode the cyclin L protein. They differ at the N-terminal end.
###Gene_Info_Comments GLEAN3_20986 ###
Unknown protein containing a Cyclin domain
###Gene_Info_Comments GLEAN3_11295 ###
Potential N-terminal sequence of Sp-Cyclin L found in GLEAN3_14989
Three GLEAN predictions, GLEAN3_00328, GLEAN3_14989 and GLEAN3_11295, encode the cyclin L protein. They differ at the N-terminal end.
###Gene_Info_Comments GLEAN3_11190 ###
This gene was annotated and modified based on bioinformatic evidence (analysis of multiple protein sequence alignments and domain structures).

The original version of GLEAN3_11190 showed a domain composition/structure very similar but not identical to that of vertebrate and Drosophila Stam genes. Inspection of other predictions revealed that an otherwise almost identical Genscan model incorporates an additional exon (supported by noticeable signal from the genome-wide tiling array hybridization data). When translated, this Genscan model showed an improved alignment to Stam genes and a domain structure now identical to that of vertebrate and fruit fly Stams. We have therefore decided to modify GLEAN3_11190 accordingly.
###Gene_Info_Comments GLEAN3_28525 ###
This gene may represent a partial duplication of GLEAN3_11190. It is located at the end of a relatively small scaffold, and their sequence identity is 99% at the amino acid level and >94% at the nucleotide level (including intronic and flanking sequences from the contigs where both models map), which suggests they may reflect an assembly error.
###Gene_Info_Comments GLEAN3_08053 ###
This GLEAN sequence corresponds to an exact duplication of the N-terminal region of Sp-Faim (GLEAN3_03262).
###Gene_Info_Comments GLEAN3_19983 ###
>GLEAN3_19983|Scaffold499|161923|162036| 
>GLEAN3_19983|Scaffold499|162415|162741| 
contain sequences conserved in DAN proteins.
###Gene_Info_Comments GLEAN3_03281 ###
Partial sequence. 
###Gene_Info_Comments GLEAN3_05441 ###
Partial sequence longer than its duplicate GLEAN3_03281 but still partial. N.B.: the two duplicates are not identical
###Gene_Info_Comments GLEAN3_01228 ###
Unknown CYP. Fragmentary. Last 3 or 4 exons of a P450 with insufficient homology to known proteins to identify. First exon may not be good.
###Gene_Info_Comments GLEAN3_02899 ###
Partial CYP2-like gene. Near GLEAN3_02898, Sp-Cyp2-like8, suggesting possible tandem duplication as is known for other CYP2s in many species.
###Gene_Info_Comments GLEAN3_01773 ###
Only the N-terminal region of the Query sequence is covered by the GLEAN3 model.

The first 42 residues of GLEAN3_01773 are unique whereas the remainder of the sequence is identical to and contained within GLEAN3_01777. 
###Gene_Info_Comments GLEAN3_18723 ###
Possible extra exon in the GLEAN3 model, length extended relative to the Query sequence used.

GLEAN3_18723 is near exact match to GLEAN3_28113 with the exception of an extended N-terminal region.
###Gene_Info_Comments GLEAN3_18795 ###
Possible missing exon in the middle region, length reduced relative to the Query sequence used.
###Gene_Info_Comments GLEAN3_21197 ###
Possible missing exon C-terminal region.  The GLEAN3 model only covers the N-terminal region of the Query sequence used.
GLEAN3_21197 is contained within GLEAN3_21198.
###Gene_Info_Comments GLEAN3_21198 ###
GLEAN3_21198 contains GLEAN3_21197.
###Gene_Info_Comments GLEAN3_28113 ###
Possible exon duplication.  The GLEAN3 model may have a duplication of the C-terminal region revealed by the alignment with the Query sequence used.
GLEAN3_28113 is a near exact alignment to GLEAN3_18723 and is contained within it.
###Gene_Info_Comments GLEAN3_01777 ###
Only the N-terminal region of the Query sequence is covered by the GLEAN3 model.

GLEAN3_01773 is contained within GLEAN3_01777.
###Gene_Info_Comments GLEAN3_04494 ###
A clear sequence match to Msh5 but with low coverage of the Query sequence.
###Gene_Info_Comments GLEAN3_11199 ###
GLEAN3_11199 contains GLEAN3_21406.
###Gene_Info_Comments GLEAN3_21406 ###
GLEAN3_21406 is a fragment of GLEAN3_11199.
###Gene_Info_Comments GLEAN3_18944 ###
GLEAN3_18944 contains an extended C-terminus of low-complexity sequence relative to the Query sequence used.
###Gene_Info_Comments GLEAN3_07033 ###
Fragment, missing C terminus due to incomplete scaffold
###Gene_Info_Comments GLEAN3_28358 ###
Fragment, missing C terminus, possibly other exons due to incomplete scaffolds
###Gene_Info_Comments GLEAN3_03760 ###
Allele: GLEAN3_03908
###Gene_Info_Comments GLEAN3_03908 ###
Allele: GLEAN3_03760
###Gene_Info_Comments GLEAN3_00064 ###
gi|68420855|ref|XP_700381.1|  PREDICTED: similar to Muscarinic acetylcholine receptor M3, partial [Danio rerio]
###Gene_Info_Comments GLEAN3_00078 ###
G-protein coupled receptor 88
###Gene_Info_Comments GLEAN3_06016 ###
The first 60 aa of this GLEAN model (KDIGRRLGLLEADLENIESDYPKQKERGYQMLLKWRQMTRNKDLVKTLVQGLQSVQRVDLADKYGPRFEALFPSEIESD) show homology with the death domains of proteins from the TNFR family.

However, the rest of the sequence is more closely related to NOD/NALP proteins, although the last four exons encode a sterol-desaturase domain, which normally does not belong to this type of molecule.

An assembly problem must have occurred during the generation of this sequence.
###Gene_Info_Comments GLEAN3_00283 ###
NB: sequence identical to GLEAN3_07382
###Gene_Info_Comments GLEAN3_07382 ###
NB: sequence identical to GLEAN3_00283
###Gene_Info_Comments GLEAN3_23408 ###
End of the Nek10 sequence. See GLEAN3_18375 for the complete gene features.
###Gene_Info_Comments GLEAN3_18440 ###
GLEAN3_18441 predicts the first half of SND1 and GLEAN3_18440 has the rest of the gene.
###Gene_Info_Comments GLEAN3_18441 ###
GLEAN3_18441 predicts the first half of SND1 and GLEAN3_18440 has the rest of the gene.
###Gene_Info_Comments GLEAN3_26759 ###
GLEAN3_02501 is a partial duplicate prediction.
###Gene_Info_Comments GLEAN3_02501 ###
Partial duplicate prediction for GLEAN3_26759
###Gene_Info_Comments GLEAN3_02533 ###
This prediction is likely incorrect. There are at least two separate genes present in this GLEAN. The latter half of the prediction matches the human MELK gene well. See the alignment.
###Gene_Info_Comments GLEAN3_07194 ###
GLEAN3_07194, GLEAN3_24350 and GLEAN3_27411 are tudor domain-containing proteins with weak homology to human tudor domain-containing protein 1 (TDRD1). The TDRD1 ortholog in urchin is represented by GLEAN3_17916. These may be novel tudor domain proteins or may be incorrect predictions.
###Gene_Info_Comments GLEAN3_24350 ###
GLEAN3_07194, GLEAN3_24350 and GLEAN3_27411 are tudor domain-containing proteins with weak homology to human tudor domain-containing protein 1 (TDRD1). The TDRD1 ortholog in urchin is represented by GLEAN3_17916. These may be novel tudor domain proteins or may be incorrect predictions.
###Gene_Info_Comments GLEAN3_27411 ###
GLEAN3_07194, GLEAN3_24350 and GLEAN3_27411 are tudor domain-containing proteins with weak homology to human tudor domain-containing protein 1 (TDRD1). The TDRD1 ortholog in urchin is represented by GLEAN3_17916. These may be novel tudor domain proteins or may be incorrect predictions.
###Gene_Info_Comments GLEAN3_11603 ###
GLEAN3_11603 covers up to residue 700 of the 1087 residues in the Query.
###Gene_Info_Comments GLEAN3_12136 ###
GLEAN3_12136 coverage is limited to the first 432 of 615 residues in the Query sequence used.
###Gene_Info_Comments GLEAN3_12137 ###
GLEAN3_12137 coverage is limited to residues 441 (M) to 606 of the 614-residue sequence used as a Query.
###Gene_Info_Comments GLEAN3_19559 ###
GLEAN3_19559 contains an N-terminal UBA domain, but by sequence match it is as named.
###Gene_Info_Comments GLEAN3_22259 ###
GLEAN3_22259 is missing 55 residues relative to the Query.
###Gene_Info_Comments GLEAN3_25405 ###
Possible missing upstream exon relative to Query.
###Gene_Info_Comments GLEAN3_00486 ###
GLEAN3_00486 covers residues 267-477 of the 608 aa Query protein sequence.  Additional regions of the Query are present on scaffold 26695.  Add exons to the features table.
###Gene_Info_Comments GLEAN3_15411 ###
GLEAN3_25651 has the first part of the SKIV2L gene and GLEAN3_15411 has the latter half.
###Gene_Info_Comments GLEAN3_25651 ###
GLEAN3_25651 has the first part of the SKIV2L gene and GLEAN3_15411 has the latter half.
###Gene_Info_Comments GLEAN3_18262 ###
GLEAN3_18262 sequence is close but not identical to GLEAN3_28109 sequence.
###Gene_Info_Comments GLEAN3_11552 ###
This prediction should be combined with GLEAN3_11551.  It contains exons encoding CUB domains that are very likely to complete the C-terminal sequence of the astacin protease in GLEAN3_11551.
###Gene_Info_Comments GLEAN3_07658 ###
From Pfam 19.0

Accession number: PF02301
HORMA domain

The HORMA (for Hop1p, Rev7p and MAD2) domain has been suggested to recognise chromatin states that result from DNA adducts, double stranded breaks or non-attachment to the spindle and acts as an adaptor that recruits other proteins. MAD2 is a spindle checkpoint protein which prevents progression of the cell cycle upon detection of a defect in mitotic spindle integrity. 

###Gene_Info_Comments GLEAN3_16042 ###
Possible missing N-terminal coding exon relative to Query used.
###Gene_Info_Comments GLEAN3_08287 ###
Blasts as myotubularin-related protein 2 isoform 2.
###Gene_Info_Comments GLEAN3_09486 ###
e val = 2e-67 for NP_005724
Almost exact match to XP_790534 : PREDICTED: similar to kinesin family member 20A, partial [Strongylocentrotus purpuratus] 
1197 nt spread out over 8 or 9 exons.
Exon 8 may represent a false prediction of an exon or may include sequence errors.  All other exons were perfect matches to accession #  XP_790534.  With a 1 nucleotide shift, exon 8 is approximately 80% identical between the described GLEAN3_09486 exon 8 and XP_790534.
Same sequence is found on Scaffoldi2484 from sp_20060316.asm.
Annotation by RA Obar, RL Morris, J Bhatia, BA Jeffrey, AM Musante, EJ Jin, BJ Rossetti and AP Rawson
###Gene_Info_Comments GLEAN3_01874 ###
BLAST with the Mus gene did not return the same GLEAN3 hit as BLAST with the Homo sapiens gene.
###Gene_Info_Comments GLEAN3_22840 ###
e val for NP_524883=3e-100.
e val for NP_612433=1e-63; kinesin family member 12 [Homo sapiens].  
Similarity to NP_612433 is based on overlap of C terminal half of Glean3_22840 with N terminus of 612433.  612433 contains only partial kinesin motor domain when compared with human CENP-E. 
Annotation by RA Obar, RL Morris, SA Tower, SC Cummings, EA Kovacs, and AP Rawson.  
###Gene_Info_Comments GLEAN3_11918 ###
GLEAN3_25220 also has very good alignment.
###Gene_Info_Comments GLEAN3_07768 ###
From Swiss-Prot entry:
"Interacts with BIRC4/XIAP. These two proteins are likely to coexist in a complex with TAK1, TRAF6, TAB1 and TAB2 (By similarity)." 

###Gene_Info_Comments GLEAN3_21537 ###
myotubularin-related protein 9
###Gene_Info_Comments GLEAN3_25276 ###
myotubularin related protein 12
No myotubularin domain in this protein.
###Gene_Info_Comments GLEAN3_14336 ###
From Swiss-Prot:
"May function as a ubiquitin-protein or polyubiquitin hydrolase. This deubiquitinating enzyme which functions at the endosome, is able to oppose the ubiquitin-dependent sorting of receptors to lysosomes (By similarity)." 

###Gene_Info_Comments GLEAN3_26259 ###
May be PTPR10D. Partial sequence.  Contains PTP catalytic domain.
###Gene_Info_Comments GLEAN3_22005 ###
From Swiss-Prot:
"Probable component of the SCF (SKP1-CUL1-F-box protein) E3 ubiquitin ligase complex which mediates the ubiquitination and subsequent proteasomal degradation of target proteins involved in cell cycle progression, signal transduction and transcription. Through the RING-type zinc finger, seems to recruit the E2 ubiquitination enzyme to the complex and brings it into close proximity to the substrate. May play a role in protecting cells from apoptosis induced by redox agents. "

###Gene_Info_Comments GLEAN3_26336 ###
GLEAN3_26336 likely codes for part 1 of the DHX34 gene.
GLEAN3_27857 likely codes for part 2 of the DHX34 gene.
GLEAN3_11079 likely codes for part 3 of the DHX34 gene.
###Gene_Info_Comments GLEAN3_27857 ###
GLEAN3_26336 likely codes for part 1 of the DHX34 gene.
GLEAN3_27857 likely codes for part 2 of the DHX34 gene.
GLEAN3_11079 likely codes for part 3 of the DHX34 gene.
###Gene_Info_Comments GLEAN3_11079 ###
GLEAN3_26336 likely codes for part 1 of the DHX34 gene.
GLEAN3_27857 likely codes for part 2 of the DHX34 gene.
GLEAN3_11079 likely codes for part 3 of the DHX34 gene.
###Gene_Info_Comments GLEAN3_10141 ###
Same as GLEAN3_00897.
###Gene_Info_Comments GLEAN3_00897 ###
Same as GLEAN3_10141.
###Gene_Info_Comments GLEAN3_12067 ###
Assemble fragments to obtain Query coverage
GLEAN3_11485 N-terminal coverage
GLEAN3_27393 also N-terminal from alternate region of query
GLEAN3_12067 middle region of query covered
GLEAN3_18381 C-terminus

###Gene_Info_Comments GLEAN3_09949 ###
GLEAN3_09949 is a fragment of GLEAN3_00595.
###Gene_Info_Comments GLEAN3_11022 ###
Comparison to the best blast hit suggests that the gene model lacks both N- and C-terminal sequences.  Note: the best blast hit encodes a huge protein of more than 2800 amino acids.
###Gene_Info_Comments GLEAN3_01344 ###
Partial sequence.
###Gene_Info_Comments GLEAN3_17796 ###
multiple ankyrin repeats in the encoded protein
###Gene_Info_Comments GLEAN3_17839 ###
the encoded protein has several ankyrin repeats
###Gene_Info_Comments GLEAN3_01143 ###
the encoded protein has several ankyrin repeats
###Gene_Info_Comments GLEAN3_15973 ###
the encoded protein has several ankyrin repeats
###Gene_Info_Comments GLEAN3_01707 ###
the encoded protein has several ankyrin repeats
###Gene_Info_Comments GLEAN3_15269 ###
the encoded protein has several ankyrin repeats
###Gene_Info_Comments GLEAN3_19767 ###
the encoded protein has several ankyrin repeats
###Gene_Info_Comments GLEAN3_22497 ###
the encoded protein has several ankyrin repeats
###Gene_Info_Comments GLEAN3_10321 ###
the encoded protein has several ankyrin repeats
###Gene_Info_Comments GLEAN3_15601 ###
the encoded protein has several ankyrin repeats
###Gene_Info_Comments GLEAN3_04926 ###
Partial sequence of DUSP4
###Gene_Info_Comments GLEAN3_09401 ###
Sequence spans collagen and head domains.  Profile scan using ScanProsite identified C1q profile from residues 133 to 269 (score 28.379).
###Gene_Info_Comments GLEAN3_21191 ###
May have an extra exon towards the end of the prediction.
###Gene_Info_Comments GLEAN3_00282 ###
Possible missing C-terminal coding exon relative to query.
###Gene_Info_Comments GLEAN3_07253 ###
Predicted: similar to sterol regulatory element binding protein
###Gene_Info_Comments GLEAN3_17267 ###
Predicted: similar to mucin 19 in S. purpuratus
###Gene_Info_Comments GLEAN3_14795 ###
PREDICTED: similar to TRPC4-associated protein isoform b
(transient receptor potential cation channel)
###Gene_Info_Comments GLEAN3_23016 ###
PREDICTED: similar to Fras1 related extracellular matrix protein 1
###Gene_Info_Comments GLEAN3_23856 ###

The AF-6 protein (also known as afadin and canoe) is a multidomain cell junction protein that contains two N-terminal Ras-associating (RA) domains in addition to FHA (forkhead-associated), DIL (class V myosin homology region), and PDZ domains and a proline-rich region
###Gene_Info_Comments GLEAN3_09389 ###
Hypothetical protein similar to CG15216-PA gene model 
No other info known
###Gene_Info_Comments GLEAN3_25634 ###
GLEAN feature incomplete; missing exons predicted by Fgenesh.
###Gene_Info_Comments GLEAN3_09107 ###
This is a homolog of the Mus musculus RIKEN cDNA 2310067G05 gene; its function is not known. If you know the gene function, you can rename it and take it.
###Gene_Info_Comments GLEAN3_12221 ###
There are 6 to 8 copies of this protein in the sea urchin genome, based on blast search.
###Gene_Info_Comments GLEAN3_22632 ###
The protein is shorter than its mouse homolog.
###Gene_Info_Comments GLEAN3_05033 ###
There are about 7 more copies of Calm1 in the sea urchin genome. GLEAN3_05032, GLEAN3_05033, GLEAN3_21511, GLEAN3_10425, GLEAN3_08000, GLEAN3_17814, GLEAN3_24195.

###Gene_Info_Comments GLEAN3_18898 ###
This sequence was published in: Wedaman, K.P., Knight, A.E., Kendrick-Jones, J. and Scholey, J.M. "Sequences of sea urchin kinesin light chain isoforms."  J. Mol. Biol. 231 (1), 155-158 (1993).
There are 4 known spliceoforms (mRNAs) encoded by this gene, "kinesin light chain isoform 1" through "isoform 4."  The one predicted in GLEAN3_18898 has been named  "kinesin light chain isoform 4" (KLC-4).
###Gene_Info_Comments GLEAN3_23443 ###
VERY LARGE ADHESION PROTEIN - MAY BE A CONCATENATION

LDLa x5 CCP2-EGFCa x3-NIDO-aAMOP-VWD-CA -EGF-many EGFCA-FOLN-TM

Novel architecture
###Gene_Info_Comments GLEAN3_21797 ###
GLEAN3_01260 is a partial duplicate prediction for GLEAN3_21797.
###Gene_Info_Comments GLEAN3_13905 ###
GLEAN3_08049 and GLEAN3_13905 are duplicate predictions. 
###Gene_Info_Comments GLEAN3_08049 ###
GLEAN3_08049 and GLEAN3_13905 are duplicate predictions.
###Gene_Info_Comments GLEAN3_13521 ###
GLEAN3_13521 is a partial duplicate prediction for GLEAN3_10382 (first 180 AA of both GLEANs). The rest of 13521 does not appear to be similar to any protein in the database.
###Gene_Info_Comments GLEAN3_10338 ###
This prediction is most similar to DDX43, though it could be a different DDX protein. It certainly is like a DDX protein in any case.
###Gene_Info_Comments GLEAN3_16062 ###
This GLEAN is almost certainly an incorrect version of the gene represented by GLEAN3_16061, which is a Tubulin binding cofactor A (TBCA) homolog.  These two (adjacent) GLEANs differ only in their predicted amino-termini, but the predicted amino-terminus of GLEAN3_16061 matches the rest of the proteins in the TBCA family well, while the predicted amino-terminus of GLEAN3_16062 does not.
###Gene_Info_Comments GLEAN3_01260 ###
GLEAN3_01260 is a partial duplicate prediction for GLEAN3_21797.
###Gene_Info_Comments GLEAN3_10463 ###
This gene is in three GLEANs. GLEAN3_10463 has part 1 (to ~570 AA), GLEAN3_24100 has part 2 (from ~200-1246 AA) and GLEAN3_24101 has part 3 (~1028-1304 AA). AA numbers refer to the human protein. In addition, the GLEAN3_10463 prediction overlaps GLEAN3_24100 (~200-600 AA from 10463 overlap with 1-396 AA from 24100).
###Gene_Info_Comments GLEAN3_24100 ###
This gene is in three GLEANs. GLEAN3_10463 has part 1 (to ~570 AA), GLEAN3_24100 has part 2 (from ~200-1246 AA) and GLEAN3_24101 has part 3 (~1028-1304 AA). AA numbers refer to the human protein. In addition, the GLEAN3_10463 prediction overlaps GLEAN3_24100 (~200-600 AA from 10463 overlap with 1-396 AA from 24100).
###Gene_Info_Comments GLEAN3_24101 ###
This gene is in three GLEANs. GLEAN3_10463 has part 1 (to ~570 AA), GLEAN3_24100 has part 2 (from ~200-1246 AA) and GLEAN3_24101 has part 3 (~1028-1304 AA). AA numbers refer to the human protein. In addition, the GLEAN3_10463 prediction overlaps GLEAN3_24100 (~200-600 AA from 10463 overlap with 1-396 AA from 24100).
###Gene_Info_Comments GLEAN3_01832 ###
Even though this is only a partial prediction, it precisely matches human SF3B14 protein. The rest of the human SF3B14 protein is not represented by other GLEANs.
WGS clone SPWDP1E744370A may have the missing 50 AA at the end.
###Gene_Info_Comments GLEAN3_28676 ###
One missing Kelch repeat (only 5 of 6 expected relative to query) could indicate an incomplete model near the C-terminus.
###Gene_Info_Comments GLEAN3_21088 ###
Likely missing initial ~200 AA as compared to the human protein.

GLEAN3_04163 is a duplicate prediction for GLEAN3_21088.
###Gene_Info_Comments GLEAN3_15453 ###
GLEAN3_15453 is a partial duplicate prediction for GLEAN3_05681 and MAY represent a better model for the latter half of this gene.
###Gene_Info_Comments GLEAN3_05681 ###
GLEAN3_15453 is a partial duplicate prediction for GLEAN3_05681 and MAY represent a better model for the latter half of this gene.
###Gene_Info_Comments GLEAN3_12779 ###
GLEAN3_09634 is a duplicate prediction. May be missing first exon.
###Gene_Info_Comments GLEAN3_09634 ###
GLEAN3_09634 is a duplicate prediction for GLEAN3_12779.
###Gene_Info_Comments GLEAN3_08095 ###
GLEAN3_05311 is a duplicate prediction for GLEAN3_08095.
###Gene_Info_Comments GLEAN3_18805 ###
From Swiss-Prot entry:
"FUNCTION: Substrate-recognition component of the SCF (SKP1-CUL1-F-box protein)-type E3 ubiquitin ligase complex (By similarity)." 
###Gene_Info_Comments GLEAN3_10835 ###
Possibly missing the first exon.
###Gene_Info_Comments GLEAN3_05311 ###
GLEAN3_05311 is a duplicate prediction for GLEAN3_08095.
###Gene_Info_Comments GLEAN3_20121 ###
GLEAN3_20121 has the first part of the gene. GLEAN3_14430 has the rest of the gene.
###Gene_Info_Comments GLEAN3_00801 ###
GLEAN3_23248 has first part of the gene. GLEAN3_00801 has the rest.
###Gene_Info_Comments GLEAN3_23248 ###
GLEAN3_23248 has first part of the gene. GLEAN3_00801 has the rest.
###Gene_Info_Comments GLEAN3_24643 ###
GLEAN3_24643 is a partial duplicate prediction for GLEAN3_24644.
###Gene_Info_Comments GLEAN3_24644 ###
GLEAN3_24643 is a partial duplicate prediction for GLEAN3_24644.
###Gene_Info_Comments GLEAN3_24273 ###
Phylogenetic analysis shows that this GLEAN model is in a clade with human Niemann-Pick C1, which is a patched-related protein.
###Gene_Info_Comments GLEAN3_28882 ###
Phylogenetic analysis shows that this GLEAN is highly similar to Glean3_24273.  They are both in a clade with human Niemann-Pick C1.
###Gene_Info_Comments GLEAN3_27985 ###
GLEAN3_26660 has the same hit.
###Gene_Info_Comments GLEAN3_24492 ###
GLEAN3_22249 has the same hit.
###Gene_Info_Comments GLEAN3_17600 ###
May be missing an exon at the beginning and end.
###Gene_Info_Comments GLEAN3_00170 ###
GLEAN3_00170 has the first part of the gene and GLEAN3_00171 has the rest.
###Gene_Info_Comments GLEAN3_02448 ###
Same as GLEAN3_13119.
###Gene_Info_Comments GLEAN3_23519 ###
GLEAN3_23520 is a partial duplicate prediction of GLEAN3_23519.
###Gene_Info_Comments GLEAN3_23520 ###
GLEAN3_23520 is a partial duplicate prediction of GLEAN3_23519.
###Gene_Info_Comments GLEAN3_00171 ###
GLEAN3_00170 has the first part of the gene and GLEAN3_00171 has the rest.
###Gene_Info_Comments GLEAN3_07870 ###
This GLEAN MAY represent the RANBP2 ortholog in Urchin. RANBP2 in humans encodes a very large RAN-binding protein that immunolocalizes to the nuclear pore complex.
###Gene_Info_Comments Sp-185/333-01 ###
A partial gene on the end of the scaffold that includes the leader, and the start of the open reading frame (Elements 1-2).  
###Gene_Info_Comments Sp-185/333-02 ###
Scaffold65222 is entirely 185/333 sequence, but the scaffold starts in the intron, so there is no start codon.  The element pattern of the sequence is most likely C4, although the scaffold sequence ends just before the putative stop codon.
###Gene_Info_Comments Sp-185/333-03 ###
Partial gene: Leader, Intron, elements 1-3.
###Gene_Info_Comments GLEAN3_02645 ###
This GLEAN MAY code for SRPK1.
###Gene_Info_Comments GLEAN3_27906 ###
A portion of the derived peptide sequence matches the C-type lectin domain (smart00034, E value = 3e-04).

Expressed in PMC EST libraries.  On the same scaffold as PM27.
###Gene_Info_Comments glean3_28945 ###
Partial gene--continues off the beginning of Scaffold1445.  Did not appear in the original glean3_XXXXX models.
###Gene_Info_Comments GLEAN3_22984 ###
GLEAN3_22984 and GLEAN3_24392 are both likely candidates for SFRS8. They internally have a significant overlap as well.
###Gene_Info_Comments GLEAN3_24392 ###
GLEAN3_22984 and GLEAN3_24392 are both likely candidates for SFRS8. They internally have a significant overlap as well.
###Gene_Info_Comments GLEAN3_17377 ###
GLEAN3_17377 is a duplicate prediction for GLEAN3_22855.
###Gene_Info_Comments GLEAN3_22855 ###
GLEAN3_17377 is a duplicate prediction for GLEAN3_22855.
###Gene_Info_Comments GLEAN3_10081 ###
e val = e -136 for NP_878906.
This peptide is identical in length (476 aa) and sequence to Glean3_00875 on scaffold 113994.
Annotated by RA Obar, BD Dyer, RL Morris.
###Gene_Info_Comments GLEAN3_25021 ###
Possible duplicated gene, GLEAN3_27654
###Gene_Info_Comments GLEAN3_27654 ###
Possible duplicated gene, GLEAN3_25021
###Gene_Info_Comments GLEAN3_01555 ###
Possible assembly error; GLEAN3_08070 may belong to the 3' end of this gene.
###Gene_Info_Comments GLEAN3_08070 ###
Possible assembly error; GLEAN3_01555 may belong to the 5' end of this gene.
###Gene_Info_Comments GLEAN3_13727 ###
GLEAN3_13727 has first part of the LSM14A gene and GLEAN3_13728 has the latter half.
###Gene_Info_Comments GLEAN3_13728 ###
GLEAN3_13727 has first part of the LSM14A gene and GLEAN3_13728 has the latter half.
###Gene_Info_Comments GLEAN3_25049 ###
GLEAN3_16338 is a duplicate prediction for GLEAN3_25409
###Gene_Info_Comments GLEAN3_16338 ###
GLEAN3_16338 is a duplicate prediction for GLEAN3_25409
###Gene_Info_Comments GLEAN3_25266 ###
Missing N-terminus.  N-terminus is GLEAN3_28560.  
###Gene_Info_Comments GLEAN3_28560 ###
Missing C-terminus.  C-terminus is predicted in GLEAN3_25266.  The central domain overlaps.
###Gene_Info_Comments GLEAN3_20437 ###
GLEAN3_03537 has first part of USP52 and GLEAN3_20437 has the rest.
###Gene_Info_Comments GLEAN3_03537 ###
GLEAN3_03537 has first part of USP52 and GLEAN3_20437 has the rest.
###Gene_Info_Comments GLEAN3_08448 ###
This GLEAN3 prediction is likely to be incorrect. 
###Gene_Info_Comments GLEAN3_28580 ###
GLEAN3_28452 is a duplicate prediction for GLEAN3_28580.
###Gene_Info_Comments GLEAN3_28452 ###
GLEAN3_28452 is a duplicate prediction for GLEAN3_28580.
###Gene_Info_Comments GLEAN3_27408 ###
GLEAN3_27408 has the first part and GLEAN3_16447 has the rest of the EXOSC10 gene. In addition, GLEAN3_27408 and GLEAN3_16447 share a significant partially identical overlap.
###Gene_Info_Comments GLEAN3_16447 ###
GLEAN3_27408 has the first part and GLEAN3_16447 has the rest of the EXOSC10 gene. In addition, GLEAN3_27408 and GLEAN3_16447 share a significant partially identical overlap.
###Gene_Info_Comments GLEAN3_24074 ###
Contains 2 different genes.  One portion is similar to Solute carrier family 23, member 1. The other portion is similar to PTPRT (Receptor type protein tyrosine phosphatase T).  In phylogenetic analysis, the PTPRT portion does not group with the PTPR K/M/T/U clade.  It formed a unique clade with Glean3_27290; both were renamed PTPRW.
###Gene_Info_Comments GLEAN3_06216 ###
GLEAN3_19209 is a significant partial duplicate prediction for GLEAN3_06216.
###Gene_Info_Comments GLEAN3_19209 ###
GLEAN3_19209 is a significant partial duplicate prediction for GLEAN3_06216.
###Gene_Info_Comments GLEAN3_22282 ###
May be the PTP domain of PTPRA. Partial gene.  Best hit was PTPSp8.
###Gene_Info_Comments GLEAN3_09557 ###
Partial sequence. PTPN3? PTPN4?
###Gene_Info_Comments GLEAN3_02877 ###
GLEAN3_02877, 06741, 17776, 17115 and 10544 are partially identical duplicate predictions of varying degree.
###Gene_Info_Comments GLEAN3_06741 ###
GLEAN3_02877, 06741, 17776, 17115 and 10544 are partially identical duplicate predictions of varying degree.
###Gene_Info_Comments GLEAN3_17776 ###
GLEAN3_02877, 06741, 17776, 17115 and 10544 are partially identical duplicate predictions of varying degree.
###Gene_Info_Comments GLEAN3_17115 ###
GLEAN3_02877, 06741, 17776, 17115 and 10544 are partially identical duplicate predictions of varying degree.
###Gene_Info_Comments GLEAN3_10544 ###
GLEAN3_02877, 06741, 17776, 17115 and 10544 are partially identical duplicate predictions of varying degree.
###Gene_Info_Comments GLEAN3_26886 ###
There may be one or more extra exons in the prediction.
###Gene_Info_Comments GLEAN3_17940 ###
Partial sequence.
###Gene_Info_Comments GLEAN3_07363 ###
Partial sequence.
###Gene_Info_Comments GLEAN3_14401 ###
The following exon prediction is probably incorrect as it codes for another peptide sequence found in a different class of proteins.
>GLEAN3_14401|Scaffold442|42154|42721| DNA_SRC: Scaffold442 START: 42154 STOP: 42721 STRAND: + 
ATGGAAGAGCAAATCACCGCAAATCTTTTTAATTGCTCTATCATGAATGCTATGCCCTACGACGATGATA
ACTTTGAGTCGCCATCGACATCACCACCTACATACGCCGAGCTCACACCCGCTGTCAATCACACTTTCAA
TCACGGCAACATCAATTTTGATCACAACACCAGTTACGACGACGGCAACATAAGATATGAACACGACAAC
AGCAACCATAACTTTGACGAACAAGTACCCTTGAGCACCGCGCATCTTCTTGACATCTTATCGACGACGG
ATGTCGACATCAACAATATCGCAAATGACGGGGAGGAAGAGGGAAGCGACGAGGGGAGCGAACTCGCAGC
GTATCTCTTTCAGAATTCGGAATGGATTACGAATAACGCGACTTTAGACGATTCTCAATATTCAACTGCA
GTTAACGGTGACCCGCAACACTTTCAGAGTTGCTACACGAATAAGTCCATGGGCTATGGCAACACTTCGT
TCAACAGCAGCTATCATGAGGCTCACACCTTGCCACAAGTACCTTATTTTGGACATACTGATACTCAACA
TGCTCAAG

###Gene_Info_Comments GLEAN3_10793 ###
Missing N-terminal residues; partial.
###Gene_Info_Comments GLEAN3_02993 ###
GLEAN3_02993 has the first third of the gene. GLEAN3_25005 has the last third. The middle part appears to be missing.
###Gene_Info_Comments GLEAN3_25005 ###
GLEAN3_02993 has the first third of the gene. GLEAN3_25005 has the last third. The middle part appears to be missing.
###Gene_Info_Comments GLEAN3_23532 ###
Domains: DEATH, NACHT, LRRs

###Gene_Info_Comments GLEAN3_25680 ###
Domains: DEATH, NACHT, LRRs

###Gene_Info_Comments GLEAN3_28681 ###
Domains: DEATH, NACHT, LRRs
###Gene_Info_Comments GLEAN3_03200 ###
Domains: DEATH, NACHT, LRRs

###Gene_Info_Comments GLEAN3_09017 ###
Domains: DEATH, NACHT, LRRs
###Gene_Info_Comments GLEAN3_14128 ###
Domains: Signal peptide, DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_04053 ###
Domains: DEATH, NACHT, LRRs
###Gene_Info_Comments GLEAN3_02641 ###
Domains: DEATH, NACHT, LRRs
###Gene_Info_Comments GLEAN3_26921 ###
Domains: DEATH, NACHT, LRRs.

###Gene_Info_Comments GLEAN3_25914 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_15340 ###
Domains: DEATH, NACHT, LRRs.

###Gene_Info_Comments GLEAN3_17054 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_17993 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_01781 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_08382 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_09488 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_23628 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_09111 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_22780 ###
Domains: DEATH, NACHT, LRRs.

###Gene_Info_Comments GLEAN3_02868 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_20380 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_27858 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_15972 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_02436 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_24975 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_03715 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_21243 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments Sp-VEGF-3 ###
This model was created based on RACE sequence (5'end) and completed based on a Fgenesh++ model (S.P_Scaffold78.seq.N000007).
###Gene_Info_Comments GLEAN3_21148 ###
Found by tandem mass spectrometry of S. purpuratus sperm membranes.
###Gene_Info_Comments GLEAN3_06122 ###
This gene was annotated based on a manual inspection of multiple protein sequence alignments and domain structures.

This and all other Sp-SARM-related genes cluster "tightly" with Sp-Sarm (GLEAN3_11042) and other Sarm genes in MSA trees built based on the sequence of their TIR domains. In addition, most of the Sp-Sarm-related genes have domain compositions similar to that of Sarm genes (TIR, Armadillo/b-catenin and SAM domains), and those that do not are often located at the end of their respective scaffold and might be incomplete models. It should be noted, however, that the subfamily of Sarm-related genes shows a different domain organization than that of Sarms (including Sp-Sarm).
###Gene_Info_Comments GLEAN3_20008 ###
This gene was annotated based on a manual inspection of multiple protein sequence alignments and domain structures.

This and all other Sp-SARM-related genes cluster "tightly" with Sp-Sarm (GLEAN3_11042) and other Sarm genes in MSA trees built based on the sequence of their TIR domains. In addition, most of the Sp-Sarm-related genes have domain compositions similar to that of Sarm genes (TIR, Armadillo/b-catenin and SAM domains), and those that do not are often located at the end of their respective scaffold and might be incomplete models. It should be noted, however, that the subfamily of Sarm-related genes shows a different domain organization than that of Sarms (including Sp-Sarm).

NB: Identical models were generated by other gene prediction protocols.
###Gene_Info_Comments GLEAN3_07020 ###
This gene was annotated based on a manual inspection of multiple protein sequence alignments and domain structures.

This and all other Sp-SARM-related genes cluster "tightly" with Sp-Sarm (GLEAN3_11042) and other Sarm genes in MSA trees built based on the sequence of their TIR domains. In addition, most of the Sp-Sarm-related genes have domain compositions similar to that of Sarm genes (TIR, Armadillo/b-catenin and SAM domains), and those that do not are often located at the end of their respective scaffold and might be incomplete models. It should be noted, however, that the subfamily of Sarm-related genes shows a different domain organization than that of Sarms (including Sp-Sarm).
###Gene_Info_Comments GLEAN3_00764 ###
This gene was annotated based on a manual inspection of multiple protein sequence alignments and domain structures.

This and all other Sp-SARM-related genes cluster "tightly" with Sp-Sarm (GLEAN3_11042) and other Sarm genes in MSA trees built based on the sequence of their TIR domains. In addition, most of the Sp-Sarm-related genes have domain compositions similar to that of Sarm genes (TIR, Armadillo/b-catenin and SAM domains), and those that do not are often located at the end of their respective scaffold and might be incomplete models. It should be noted, however, that the subfamily of Sarm-related genes shows a different domain organization than that of Sarms (including Sp-Sarm).

NB: The exon structure and embryonic expression of this gene are supported by the genome-wide tiling array hybridization data.
###Gene_Info_Comments GLEAN3_15127 ###
This gene was annotated based on a manual inspection of multiple protein sequence alignments and domain structures.

This and all other Sp-SARM-related genes cluster "tightly" with Sp-Sarm (GLEAN3_11042) and other Sarm genes in MSA trees built based on the sequence of their TIR domains. In addition, most of the Sp-Sarm-related genes have domain compositions similar to that of Sarm genes (TIR, Armadillo/b-catenin and SAM domains), and those that do not are often located at the end of their respective scaffold and might be incomplete models. It should be noted, however, that the subfamily of Sarm-related genes shows a different domain organization than that of Sarms (including Sp-Sarm).

NB: This model lacks the SAM domains found in other members of the Sp-Sarm-related subfamily of genes (Sp-Sarm 1-4). Accordingly, it shows a weaker clustering with these models. The analysis of the scaffold in which this model is located does not reveal any obvious missing sequence, which would suggest a true loss of this domain.
###Gene_Info_Comments GLEAN3_21841 ###
This gene was modified and annotated based on a manual inspection of multiple protein sequence alignments and domain structures.

This and all other Sp-SARM-related genes cluster "tightly" with Sp-Sarm (GLEAN3_11042) and other Sarm genes in MSA trees built based on the sequence of their TIR domains. In addition, most of the Sp-Sarm-related genes have domain compositions similar to that of Sarm genes (TIR, Armadillo/b-catenin and SAM domains), and those that do not are often located at the end of their respective scaffold and might be incomplete models. It should be noted, however, that the subfamily of Sarm-related genes shows a different domain organization than that of Sarms (including Sp-Sarm).

This model was modified based on otherwise identical Fgenesh++/AB predictions that incorporate an additional C-term exon. This exon is supported by the tiling array data, and it codes for a (sub-optimal) SAM domain, which is present in other members of this subfamily of Sp-Sarm related genes.

NB: The structure and embryonic expression of this gene are supported by the embryonic tiling array data.
###Gene_Info_Comments GLEAN3_18859 ###
This gene was annotated based on a manual inspection of multiple protein sequence alignments and domain structures.

This and all other Sp-SARM-related genes cluster "tightly" with Sp-Sarm (GLEAN3_11042) and other Sarm genes in MSA trees built based on the sequence of their TIR domains. In addition, most of the Sp-Sarm-related genes have domain compositions similar to that of Sarm genes (TIR, Armadillo/b-catenin and SAM domains), and those that do not are often located at the end of their respective scaffold and might be incomplete models (NB: this model is one example; it is located at the end of the scaffold and there might be missing N-terminal sequence). It should be noted, however, that the subfamily of Sarm-related genes shows a different domain organization than that of Sarms (including Sp-Sarm).
###Gene_Info_Comments GLEAN3_27640 ###
This gene was modified and annotated based on a manual inspection of multiple protein sequence alignments and domain structures.

This and all other Sp-SARM-related genes cluster "tightly" with Sp-Sarm (GLEAN3_11042) and other Sarm genes in MSA trees built based on the sequence of their TIR domains. In addition, most of the Sp-Sarm-related genes have domain compositions similar to that of Sarm genes (TIR, Armadillo/b-catenin and SAM domains), and those that do not are often located at the end of their respective scaffold and might be incomplete models. It should be noted, however, that the subfamily of Sarm-related genes shows a different domain organization than that of Sarms (including Sp-Sarm).

This model was modified based on an otherwise identical BCM:Gene prediction that incorporates an additional N-term exon. This additional exon codes for sequence that includes a sub-optimal prediction for an Armadillo/b-catenin domain, which is present in other Sp-Sarm-related genes. The additional exon corresponds to an adjacent glean model (GLEAN3_27639) and thus the modified version of this model fuses both glean models.
###Gene_Info_Comments GLEAN3_27639 ###
This model was fused to a modified version of GLEAN3_27640. Please see GLEAN3_27640 for details.
###Gene_Info_Comments GLEAN3_18168 ###
This gene was annotated based on a manual inspection of multiple protein sequence alignments and domain structures.

This and all other Sp-SARM-related genes cluster "tightly" with Sp-Sarm (GLEAN3_11042) and other Sarm genes in MSA trees built based on the sequence of their TIR domains. In addition, most of the Sp-Sarm-related genes have domain compositions similar to that of Sarm genes (TIR, Armadillo/b-catenin and SAM domains), and those that do not are often located at the end of their respective scaffold and might be incomplete models. It should be noted, however, that the subfamily of Sarm-related genes shows a different domain organization than that of Sarms (including Sp-Sarm).
###Gene_Info_Comments GLEAN3_03495 ###
This gene was annotated based on a manual inspection of multiple protein sequence alignments and domain structures.

This and all other Sp-SARM-related genes cluster "tightly" with Sp-Sarm (GLEAN3_11042) and other Sarm genes in MSA trees built based on the sequence of their TIR domains. In addition, most of the Sp-Sarm-related genes have domain compositions similar to that of Sarm genes (TIR, Armadillo/b-catenin and SAM domains), and those that do not are often located at the end of their respective scaffold and might be incomplete models. It should be noted, however, that the subfamily of Sarm-related genes shows a different domain organization than that of Sarms (including Sp-Sarm).

NB: The exon structure and embryonic expression of this model are strongly supported by the tiling-array data.
###Gene_Info_Comments GLEAN3_04557 ###
This gene was annotated based on a manual inspection of multiple protein sequence alignments and domain structures.

This and all other Sp-SARM-related genes cluster "tightly" with Sp-Sarm (GLEAN3_11042) and other Sarm genes in MSA trees built based on the sequence of their TIR domains. In addition, most of the Sp-Sarm-related genes have domain compositions similar to that of Sarm genes (TIR, Armadillo/b-catenin and SAM domains), and those that do not are often located at the end of their respective scaffold and might be incomplete models. It should be noted, however, that the subfamily of Sarm-related genes shows a different domain organization than that of Sarms (including Sp-Sarm).

NB: The exon structure and embryonic expression of this model are strongly supported by the tiling array data.
###Gene_Info_Comments GLEAN3_04107 ###
This gene was annotated based on a manual inspection of multiple protein sequence alignments and domain structures.

This and all other Sp-SARM-related genes cluster "tightly" with Sp-Sarm (GLEAN3_11042) and other Sarm genes in MSA trees built from the sequences of their TIR domains. In addition, most Sp-Sarm-related genes have a domain composition similar to that of Sarm genes (TIR, Armadillo/b-catenin and SAM domains), and those that do not are often located at the end of their respective scaffolds and might be incomplete models. It should be noted, however, that the subfamily of Sarm-related genes shows a different domain organization from that of Sarms (including Sp-Sarm).
###Gene_Info_Comments GLEAN3_08302 ###
This gene was annotated based on a manual inspection of multiple protein sequence alignments and domain structures.

This and all other Sp-SARM-related genes cluster "tightly" with Sp-Sarm (GLEAN3_11042) and other Sarm genes in MSA trees built from the sequences of their TIR domains. In addition, most Sp-Sarm-related genes have a domain composition similar to that of Sarm genes (TIR, Armadillo/b-catenin and SAM domains), and those that do not are often located at the end of their respective scaffolds and might be incomplete models (this model is one such example; there may be some missing N-ter sequence). It should be noted, however, that the subfamily of Sarm-related genes shows a different domain organization from that of Sarms (including Sp-Sarm).

NB: The exon structure and embryonic expression of this model are strongly supported by tiling array data.
###Gene_Info_Comments GLEAN3_07088 ###
This gene was annotated based on a manual inspection of multiple protein sequence alignments and domain structures.

This and all other Sp-SARM-related genes cluster "tightly" with Sp-Sarm (GLEAN3_11042) and other Sarm genes in MSA trees built from the sequences of their TIR domains. In addition, most Sp-Sarm-related genes have a domain composition similar to that of Sarm genes (TIR, Armadillo/b-catenin and SAM domains), and those that do not are often located at the end of their respective scaffolds and might be incomplete models. It should be noted, however, that the subfamily of Sarm-related genes shows a different domain organization from that of Sarms (including Sp-Sarm).

This model also includes a predicted transmembrane domain, a feature not seen in other members of the subfamily of Sp-Sarm-related models.

NB: The exon structure and embryonic expression of this model are strongly supported by tiling array data. Moreover, some tiling array signal falls within introns of this prediction, so there may be additional exonic sequence that was not called by the prediction protocols.
###Gene_Info_Comments GLEAN3_16014 ###
This gene was annotated based on a manual inspection of multiple protein sequence alignments and domain structures.

Sp-TIR-containing models do not show any characteristic domain composition, nor do they cluster significantly with other TIR-containing genes (in S. purpuratus or any other species).

NB: This model is located in a small scaffold. Therefore, there could be missing sequence towards both the N-ter and C-ter ends of the model.
###Gene_Info_Comments GLEAN3_13299 ###
This seems to be a duplication of GLEAN3_07952. Please see GLEAN3_07952 for details.

Note that the adjacent GLEAN3_13298 is also most likely a duplication of GLEAN3_07951.
###Gene_Info_Comments GLEAN3_07952 ###
This gene was annotated based on a manual inspection of multiple protein sequence alignments and domain structures.

Sp-TIR-containing models do not show any characteristic domain composition, nor do they cluster significantly with other TIR-containing genes (in S. purpuratus or any other species). The TIR domain in this model showed a weak co-segregation with TIR domains from MyD88 genes in an MSA tree. However, its domain composition/structure does not resemble that of any of the MyD88 genes.

NB: The exon structure and embryonic expression of this model are partly supported by tiling array data. Also note that this model seems to be duplicated in GLEAN3_13299 (99% sequence identity at the amino acid level).
###Gene_Info_Comments GLEAN3_12671 ###
This gene was annotated based on a manual inspection of multiple protein sequence alignments and domain structures.

Sp-TIR-containing models do not show any characteristic domain composition, nor do they cluster significantly with other TIR-containing genes (in S. purpuratus or any other species). This model, however, co-clustered tightly with three other Sp-TIR-containing models (Sp-TIR-c5,6,7). Despite this strong co-segregation, Sp-TIR-c4-7 do not show any characteristic domain structure/composition.
###Gene_Info_Comments GLEAN3_14926 ###
This gene was annotated based on a manual inspection of multiple protein sequence alignments and domain structures.

Sp-TIR-containing models do not show any characteristic domain composition, nor do they cluster significantly with other TIR-containing genes (in S. purpuratus or any other species).
###Gene_Info_Comments GLEAN3_03608 ###
This gene was annotated based on a manual inspection of multiple protein sequence alignments and domain structures.

Sp-TIR-containing models do not show any characteristic domain composition, nor do they cluster significantly with other TIR-containing genes (in S. purpuratus or any other species). This model, however, co-clustered tightly with three other Sp-TIR-containing models (Sp-TIR-c4,6,7). Despite this strong co-segregation, Sp-TIR-c4-7 do not show any characteristic domain structure/composition.

NB: The exon structure and embryonic expression of this model are strongly supported by tiling array data.
###Gene_Info_Comments GLEAN3_13352 ###
This gene was annotated based on a manual inspection of multiple protein sequence alignments and domain structures.

Sp-TIR-containing models do not show any characteristic domain composition, nor do they cluster significantly with other TIR-containing genes (in S. purpuratus or any other species). This model, however, co-clustered tightly with three other Sp-TIR-containing models (Sp-TIR-c4,5,7). Despite this strong co-segregation, Sp-TIR-c4-7 do not show any characteristic domain structure/composition.

NB: This model is located at the end of a scaffold and represents a partial model (there is no ATG at the N-ter of the sequence).
###Gene_Info_Comments GLEAN3_20131 ###
This gene was annotated based on a manual inspection of multiple protein sequence alignments and domain structures.

Sp-TIR-containing models do not show any characteristic domain composition, nor do they cluster significantly with other TIR-containing genes (in S. purpuratus or any other species). This model, however, co-clustered tightly with three other Sp-TIR-containing models (Sp-TIR-c4-6). Despite this strong co-segregation, Sp-TIR-c4-7 do not show any characteristic domain structure/composition.

NB: Some exons of this model show very strong signal in the tiling array embryonic hybridization experiment, which supports the embryonic expression of this model.
###Gene_Info_Comments GLEAN3_12667 ###
This model was annotated based on a manual inspection of sequence alignments and domain composition.

We named this model with the suffix "a" (and not a number) to avoid the misleading assumption that this model would represent any specific ortholog of vertebrate SAA genes (e.g. SAA1).

This model is very similar in sequence to the adjacent GLEAN3_12668 (Sp-Saa-b). However, there are enough differences between them at both the nt and aa levels that they most likely represent two different genes.
###Gene_Info_Comments GLEAN3_12668 ###
This model was annotated based on a manual inspection of sequence alignments and domain composition.

We named this model with the suffix "b" (and not a number) to avoid the misleading assumption that this model would represent any specific ortholog of vertebrate SAA genes (e.g. SAA2).

This model is very similar in sequence to the adjacent GLEAN3_12667 (Sp-Saa-a). However, there are enough differences between them at both the nt and aa levels that they most likely represent two different genes.
###Gene_Info_Comments GLEAN3_14752 ###
This model was annotated based on reciprocal blasting and similarity of domain structure/organization with vertebrate Nck.

Its embryonic expression is partly supported by signal from the tiling array hybridization data.
###Gene_Info_Comments GLEAN3_00045 ###
Shows significant homology to human Ficolin-1.  Sequence overlap occurs in the fibrinogen domain.
###Gene_Info_Comments GLEAN3_20449 ###
 fragment
###Gene_Info_Comments GLEAN3_20501 ###
 fragment
###Gene_Info_Comments GLEAN3_20584 ###
 fragment
###Gene_Info_Comments GLEAN3_20736 ###
 fragment
###Gene_Info_Comments GLEAN3_20959 ###
 tiny fragment
###Gene_Info_Comments GLEAN3_21072 ###
 fragment
###Gene_Info_Comments GLEAN3_21329 ###
 tiny fragment
###Gene_Info_Comments GLEAN3_21353 ###
 fragment
###Gene_Info_Comments GLEAN3_21423 ###
 partial, missing C-terminus half
###Gene_Info_Comments GLEAN3_21534 ###
 fragment
###Gene_Info_Comments GLEAN3_21540 ###
 tiny fragment
###Gene_Info_Comments GLEAN3_21636 ###
 fragment
###Gene_Info_Comments GLEAN3_21657 ###
 missing N- and C-terminus
###Gene_Info_Comments GLEAN3_21711 ###
 fragment
###Gene_Info_Comments GLEAN3_21836 ###
 only N-terminal fragment
###Gene_Info_Comments GLEAN3_21984 ###
 tiny fragment
###Gene_Info_Comments GLEAN3_22014 ###
 fragment
###Gene_Info_Comments GLEAN3_22413 ###
 fragment
###Gene_Info_Comments GLEAN3_23181 ###
 fragment
###Gene_Info_Comments GLEAN3_23246 ###
 fragment
###Gene_Info_Comments GLEAN3_23276 ###
 fragment
###Gene_Info_Comments GLEAN3_23521 ###
 tiny fragment
###Gene_Info_Comments GLEAN3_23818 ###
 fragment, missing N-terminus half
###Gene_Info_Comments GLEAN3_23997 ###
 fragment
###Gene_Info_Comments GLEAN3_24225 ###
 fragment
###Gene_Info_Comments GLEAN3_24330 ###
 fragment, missing N-terminus region and a stretch in middle
###Gene_Info_Comments GLEAN3_24422 ###
 tiny fragment
###Gene_Info_Comments GLEAN3_24573 ###
 fragment
###Gene_Info_Comments GLEAN3_24701 ###
 tiny fragment
###Gene_Info_Comments GLEAN3_24730 ###
 tiny fragment
###Gene_Info_Comments GLEAN3_24872 ###
 small fragment
###Gene_Info_Comments GLEAN3_24892 ###
 fragment
###Gene_Info_Comments GLEAN3_24939 ###
 fragment
###Gene_Info_Comments GLEAN3_25048 ###
 fragment
###Gene_Info_Comments GLEAN3_25189 ###
 fragment
###Gene_Info_Comments GLEAN3_25694 ###
 fragment
###Gene_Info_Comments GLEAN3_25798 ###
 missing C-terminus residues
###Gene_Info_Comments GLEAN3_25847 ###
 small fragment
###Gene_Info_Comments GLEAN3_25877 ###
 small fragment
###Gene_Info_Comments GLEAN3_26128 ###
 fragment
###Gene_Info_Comments GLEAN3_26792 ###
 fragment
###Gene_Info_Comments GLEAN3_27323 ###
 missing N-terminus residues
###Gene_Info_Comments GLEAN3_27582 ###
 fragment
###Gene_Info_Comments GLEAN3_28841 ###
 fragment
###Gene_Info_Comments GLEAN3_11626 ###
 fragment
###Gene_Info_Comments GLEAN3_19005 ###
 fragment
###Gene_Info_Comments GLEAN3_03531 ###
 extra C-terminus residues
###Gene_Info_Comments GLEAN3_04263 ###
 fragment
###Gene_Info_Comments GLEAN3_09652 ###
 fragment
###Gene_Info_Comments GLEAN3_09845 ###
 fragment
###Gene_Info_Comments GLEAN3_15734 ###
 tiny fragment
###Gene_Info_Comments GLEAN3_17456 ###
 fragment
###Gene_Info_Comments GLEAN3_18249 ###
 fragment
###Gene_Info_Comments GLEAN3_24068 ###
 fragment
###Gene_Info_Comments GLEAN3_26707 ###
 fragment
###Gene_Info_Comments GLEAN3_28520 ###
 fragment
###Gene_Info_Comments GLEAN3_28870 ###
 fragment
###Gene_Info_Comments GLEAN3_00214 ###
 missing some N-terminus residues
###Gene_Info_Comments GLEAN3_01700 ###
 fragment
###Gene_Info_Comments GLEAN3_02107 ###
 fragment
###Gene_Info_Comments GLEAN3_02203 ###
 fragment
###Gene_Info_Comments GLEAN3_02274 ###
 fragment
###Gene_Info_Comments GLEAN3_02980 ###
 tiny fragment
###Gene_Info_Comments GLEAN3_03017 ###
 fragment
###Gene_Info_Comments GLEAN3_03265 ###
 fragment
###Gene_Info_Comments GLEAN3_03408 ###
 missing N-terminus region
###Gene_Info_Comments GLEAN3_03563 ###
 fragment
###Gene_Info_Comments GLEAN3_03804 ###
 fragment
###Gene_Info_Comments GLEAN3_04207 ###
 fragment
###Gene_Info_Comments GLEAN3_04637 ###
 fragment
###Gene_Info_Comments GLEAN3_05345 ###
 partial
###Gene_Info_Comments GLEAN3_05799 ###
 fragment
###Gene_Info_Comments GLEAN3_06178 ###
 fragment
###Gene_Info_Comments GLEAN3_07650 ###
 fragment
###Gene_Info_Comments GLEAN3_07810 ###
 fragment
###Gene_Info_Comments GLEAN3_07990 ###
 small fragment
###Gene_Info_Comments GLEAN3_08078 ###
 fragment
###Gene_Info_Comments GLEAN3_08090 ###
 tiny fragment
###Gene_Info_Comments GLEAN3_08096 ###
 fragment
###Gene_Info_Comments GLEAN3_08341 ###
 partial
###Gene_Info_Comments GLEAN3_08681 ###
 fragment
###Gene_Info_Comments GLEAN3_09010 ###
 fragment
###Gene_Info_Comments GLEAN3_09101 ###
 fragment
###Gene_Info_Comments GLEAN3_09349 ###
 fragment
###Gene_Info_Comments GLEAN3_09393 ###
 fragment
###Gene_Info_Comments GLEAN3_09442 ###
 partial, missing N-terminus region
###Gene_Info_Comments GLEAN3_09533 ###
 partial
###Gene_Info_Comments GLEAN3_09631 ###
 fragment
###Gene_Info_Comments GLEAN3_09694 ###
 tiny fragment
###Gene_Info_Comments GLEAN3_04300 ###
This is a partial sequence.  Does not form a clade with human Ppm1g in phylogenetic analysis of the PP2C subfamily of PPM phosphatases.   Most similar to GLEAN3_14625.
###Gene_Info_Comments GLEAN3_05327 ###
Partial sequence.  This is N-terminal to GLEAN3_04300. Identification is partially based on the EST sequence matching these two sequences.
###Gene_Info_Comments GLEAN3_28868 ###
Partial sequence.
###Gene_Info_Comments GLEAN3_06956 ###
Partial sequence containing PP2Ac domain.
###Gene_Info_Comments GLEAN3_19872 ###
Most of the exons encoded; exon 10  is identical to GLEAN3_04123 exon 2 while the final exon (11) is also in GLEAN3_04123 (bases 2099-2275).

###Gene_Info_Comments GLEAN3_17129 ###
Domains: NACHT, LRRs. This gene model is at the end of a scaffold. Could be incomplete.
###Gene_Info_Comments GLEAN3_00896 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_04043 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_14112 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_17038 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_17708 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_22294 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_01392 ###
NOTE: Sequence and alignment suggest this is the 5' end of a single eIF-5B gene, comprised of GLEANs 01392 (C-Term) and 01393 (N-term).
###Gene_Info_Comments GLEAN3_01393 ###
NOTE: Sequence and alignment suggest this is the 5' end of a single eIF5B gene, comprised of GLEANs 01392 (C-Term) and 01393 (N-term).

###Gene_Info_Comments GLEAN3_01394 ###
NOTE: partial sequence, identical to the middle of eIF5B (GLEAN 01392)
###Gene_Info_Comments GLEAN3_00506 ###
PARTIAL.  When aligned to human SELB, this contains the start site but ends before GLEAN 22870 begins; GLEAN 22870 is the rest of SELB.
###Gene_Info_Comments GLEAN3_22870 ###
PARTIAL.  When aligned to human SELB, this contains most of SELB, but without the 5'end.  Note--GLEAN 00506 appears to contain the start site and ends before this glean begins.  
###Gene_Info_Comments GLEAN3_16991 ###
NOTE: partial sequence that is a duplication of the middle of GLEAN 22870 (the bulk of SELB).
###Gene_Info_Comments GLEAN3_03646 ###
GLEAN3_09292 is identical to 3' end of this sequence
###Gene_Info_Comments GLEAN3_09292 ###
Partial: duplication of the 3' part of GLEAN 03646 (eIF2alpha)
###Gene_Info_Comments GLEAN3_06351 ###
Sp1200 Bacterially activated coelomocyte, arrayed
cDNA clone Sp1200 5' similar to amassin, mRNA sequence.
###Gene_Info_Comments GLEAN3_23924 ###
This is the same as GLEAN3_06351, with the addition of an extra exon and some additional sequence in one of the other exons. Considered to be an allele of GLEAN3_06351. mRNA obtained from coelomocytes exposed to bacteria.
###Gene_Info_Comments GLEAN3_23548 ###
Significant homology to human ficolin2 (FCN2).  Overlaps occur at fibrinogen C-terminal domain (Expasy annotation)
###Gene_Info_Comments GLEAN3_04912 ###
GLEAN3_02231 is a duplicate prediction for GLEAN3_04912.
###Gene_Info_Comments GLEAN3_21032 ###
GLEAN3_21032 is a partial duplicate prediction for GLEAN3_08786.
###Gene_Info_Comments GLEAN3_08786 ###
GLEAN3_21032 is a partial duplicate prediction for GLEAN3_08786.
###Gene_Info_Comments GLEAN3_10773 ###
The first ~400 aa of the human protein have no homology in urchin GLEANs.
###Gene_Info_Comments GLEAN3_05511 ###
Identified from cDNA clone from gram negative bacterially activated coelomocyte of sea urchin.
###Gene_Info_Comments GLEAN3_13960 ###
Identified from cDNA clone from gram negative bacterially activated coelomocyte of sea urchin.
###Gene_Info_Comments GLEAN3_27883 ###
GLEAN3_13201 has the first part of the DHX36 gene. GLEAN3_27883 has the rest of the gene.
###Gene_Info_Comments GLEAN3_13201 ###
GLEAN3_13201 has the first part of the DHX36 gene. GLEAN3_27883 has the rest of the gene.
###Gene_Info_Comments GLEAN3_27440 ###
Partial duplicate prediction for GLEAN3_27883.
###Gene_Info_Comments GLEAN3_13786 ###
There are extra exon(s) at the end of the GLEAN.
###Gene_Info_Comments GLEAN3_08600 ###
GLEAN3_08600 is a duplicate partial prediction for GLEAN3_13786.
###Gene_Info_Comments GLEAN3_14106 ###
GLEAN3_14106 shows a significant partial overlap with GLEAN3_26336 which codes for the first part of DHX34. First 170 aa from GLEAN3_14106 show no significant homology to human proteins.
###Gene_Info_Comments GLEAN3_08946 ###
GLEAN3_14505 has the first part of DDX56 and GLEAN3_08946 has the rest.
###Gene_Info_Comments GLEAN3_14505 ###
GLEAN3_14505 has the first part of DDX56 and GLEAN3_08946 has the rest.
###Gene_Info_Comments GLEAN3_12487 ###
GLEAN3_12487 is a partial duplicate prediction for GLEAN3_08946.
###Gene_Info_Comments GLEAN3_16488 ###
GLEAN3_16488 is a partial duplicate prediction for GLEAN3_16467.
###Gene_Info_Comments GLEAN3_02662 ###
Partial sequence containing a TGF-beta domain. It has the same C-terminal end as GLEAN3_12786 but appears to differ towards its N-terminal end.
###Gene_Info_Comments GLEAN3_22079 ###
contains TGFbeta domain
###Gene_Info_Comments GLEAN3_28397 ###
GLEAN3_16552 has one part and GLEAN3_28397 appears to have the last part of DDX42.
###Gene_Info_Comments GLEAN3_16552 ###
GLEAN3_16552 has one part and GLEAN3_28397 appears to have the last part of DDX42.
###Gene_Info_Comments GLEAN3_21726 ###
contains TGFbeta_propeptide domain
###Gene_Info_Comments GLEAN3_22653 ###
partial sequence, contains TGFbeta_propeptide domain
###Gene_Info_Comments GLEAN3_26612 ###
Similar to other catalytic subunits of telomerase
###Gene_Info_Comments GLEAN3_09443 ###
e val for NP_065867= 9e-54; kinesin family member 17 [Homo sapiens]. 
In the alignment with the best human hit, there is a gap of 13 aa in Glean3_09443 which corresponds to a gap of 374 aa in the NP_065867.
Annotation by RA Obar, RL Morris, LE Shorey, SA Tower, and B Rossetti.
###Gene_Info_Comments GLEAN3_26884 ###
This GLEAN3 sequence is apparently missing some of the beginning sequences of the other Sp-amassins. Note the sequences in the alignments. The data against which it was compared came from the coelomocytes of a bacterially activated sea urchin.
###Gene_Info_Comments GLEAN3_00409 ###
This is likely a duplication of GLEAN3_13950 (98+% ID at aa level). Please see GLEAN3_13950 for details.
###Gene_Info_Comments GLEAN3_05998 ###
GLEAN3_05998 and GLEAN3_19123 code for DDX52. They have a partial identical overlap that may have a haplotype as well.
###Gene_Info_Comments GLEAN3_19123 ###
GLEAN3_05998 and GLEAN3_19123 code for DDX52. They have a partial identical overlap that may have a haplotype as well.
###Gene_Info_Comments GLEAN3_00362 ###
Potential MASP
###Gene_Info_Comments GLEAN3_03126 ###
This model was annotated based on a manual inspection of sequence alignments and domain structure.

This model shows a very high degree of similarity to part of an adjacent model (GLEAN3_03127) corresponding to one of the Sp-SRCR genes. The region of similarity includes both the extra- and intracellular juxtamembrane regions, the transmembrane domain and a long cytoplasmic tail of low complexity, but excludes the SRCR domains, which are replaced by 2xEGF + 2xIG domains.

The position and orientation of both models in a single uninterrupted contig suggests that they did not originate as an assembly problem, but that they may represent a true gene duplication/divergence event. Of note, a very similar situation can be seen with GLEAN3_22566/GLEAN3_22567.
###Gene_Info_Comments GLEAN3_22566 ###
This model was annotated based on a manual inspection of sequence alignments and domain structure.

This model shows a very high degree of similarity to part of three adjacent models (GLEAN3_22567-9) corresponding to Sp-SRCR genes. The region of similarity includes both the extra- and intracellular juxtamembrane regions, the transmembrane domain and a long cytoplasmic tail of low complexity, but excludes the SRCR domains, which are replaced by EGF+IG domains.

The position and orientation of these models in long, uninterrupted contigs suggests that they did not originate as an assembly problem, but that they may represent true gene duplication/divergence events. Of note, a very similar situation was seen with GLEAN3_03126/GLEAN3_03127.
###Gene_Info_Comments GLEAN3_05343 ###
An incomplete ORC6 sequence, missing exon 3, is also present in GLEAN3_09374 (see comments on that GLEAN).
###Gene_Info_Comments GLEAN3_09374 ###
This GLEAN3 model encodes a partial ORC6 sequence in which exon 3 is probably missing due to an inappropriate fusion of contigs. See GLEAN3_05343 for complete ORC6.
###Gene_Info_Comments GLEAN3_11491 ###
Partial MCM2 sequence, likely due to inappropriate contig fusion. See GLEAN3_06096 for the full-length sequence.
###Gene_Info_Comments GLEAN3_06096 ###
An incomplete MCM2 gene sequence is also found in GLEAN3_11491
###Gene_Info_Comments GLEAN3_06848 ###
See GLEAN3_12983 for the complete CDS
###Gene_Info_Comments GLEAN3_12983 ###
This GLEAN3 model encodes the entire MCM3 gene sequence; however, the first exon probably does not belong to this protein and was artifactually fused to the MCM3 gene in the scaffold.
###Gene_Info_Comments GLEAN3_03296 ###
This is highly correlated with two highly conserved regions: 
1) cytidine_deaminase-like region (2e-14)
2) SNF7 family (9e-22)
An alternative splicing gives different regions with weaker scores; thus, this model is considered more likely.
###Gene_Info_Comments GLEAN3_28359 ###
Highly correlated with a highly conserved region identified as nucleoside_deaminase. The accession number is from a zebrafish protein that had not been identified.
###Gene_Info_Comments GLEAN3_07007 ###
An incomplete MCM8 sequence is also found in GLEAN3_07296
###Gene_Info_Comments GLEAN3_07296 ###
Partial sequence. See the annotation of GLEAN3_07007 for the MCM8 full-length coding sequence
###Gene_Info_Comments GLEAN3_23032 ###
The scaffold assembly should be revised.
Another GLEAN, GLEAN3_24816, also encodes the CDC45 sequence; in that model, too, exons are missing or artefactually assembled.

###Gene_Info_Comments GLEAN3_24816 ###
The scaffold assembly should be revised.
Another GLEAN, GLEAN3_23032, also encodes the CDC45 sequence; in that model, too, exons are missing or artefactually assembled.

###Gene_Info_Comments GLEAN3_26280 ###
This is KRP95 of kinesin-2 cloned from Sp by Cole et al, 1993.
e val = 3e-37 for NP_004789, KIF3B [Homo sapiens]
This is the amino-terminal portion of the protein, in 11 exons.  Two other scaffolds are required to complete the gene.  KRP95 continues on scaffold91496 (one exon) then scaffold58510 (eight exons).
Annotated by RA Obar, RL Morris, B Rossetti, AM Musante, and EJ Jin.
###Gene_Info_Comments GLEAN3_21526 ###
Exon 2 is missing from Scaffold1158 due to a string of N's but is present on Scaffold181669.
###Gene_Info_Comments GLEAN3_06237 ###
Segment of KRP95, annotated fully in GLEAN3_26280.
Annotation by RL Morris, R.A.Obar, and B Rossetti.
###Gene_Info_Comments GLEAN3_09764 ###
Segment of KRP95, annotated fully in GLEAN3_26280.
Annotation by RL Morris, R.A.Obar, and B Rossetti.
###Gene_Info_Comments GLEAN3_25077 ###
Genscan model may be more accurate.
Domains: DEATH, NACHT, LRRs

###Gene_Info_Comments GLEAN3_27513 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_01054 ###
Domains: DEATH, NACHT, LRRs.

###Gene_Info_Comments GLEAN3_16921 ###
Domains: DEATH, NACHT, LRRs.
This gene could be incomplete. The scaffold is quite small.

###Gene_Info_Comments GLEAN3_08547 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_13504 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_17341 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_01548 ###
Domains: NACHT, LRRs. Genscan model has DEATH domain but no methionine. Likely incomplete.
###Gene_Info_Comments GLEAN3_21447 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_02372 ###
This gene model is likely a fusion of two models. The Fgenesh model has DEATH domain and NACHT.
###Gene_Info_Comments GLEAN3_02423 ###
Domains: DEATH, NACHT, LRRs. This model could be incomplete, the scaffold is very short.
###Gene_Info_Comments GLEAN3_26071 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_21478 ###
Domains: CARD, DEATH, NACHT, LRRs.
Some LRRs could be missing since the scaffold is short.

###Gene_Info_Comments GLEAN3_26020 ###
Domains: NACHT, LRRs. Fgenesh model has DEATH domain.
###Gene_Info_Comments GLEAN3_22394 ###
Domains: DEATH, NACHT, LRRs. Genscan model has additional exons. This gene model could be incomplete, this scaffold is small.
###Gene_Info_Comments GLEAN3_08431 ###
Domains: DEATH, NACHT, LRRs, TM.
The Genscan prediction does not have the TM.
###Gene_Info_Comments GLEAN3_24649 ###
Domains: NACHT, LRRs.
Genscan model has different exon/intron structure and contains DEATH domain. Some LRRs could be missing since this model is at the end of a scaffold.
###Gene_Info_Comments GLEAN3_15105 ###
Domains: DEATH, NACHT, LRRs.
This gene model is on a short scaffold and could be incomplete.
###Gene_Info_Comments GLEAN3_03934 ###
Domains: Signal peptide, DEATH, NACHT, LRRs.
The Fgenesh model has 2 additional exons: could have more LRRs.
###Gene_Info_Comments GLEAN3_05462 ###
Domains: DEATH, NACHT, LRRs.
This gene model is on a small scaffold. It could be incomplete.
###Gene_Info_Comments GLEAN3_02183 ###
Essentially identical in sequence to accession XP_786692, the ADEAMc (tRNA-specific adenosine deaminase) gene predicted bioinformatically for the sea urchin.
###Gene_Info_Comments GLEAN3_04872 ###
Domains: Signal peptide, DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_02962 ###
Domains: DEATH, NACHT, LRRs.

###Gene_Info_Comments GLEAN3_23550 ###
Domains: DEATH, NACHT, LRRs.

###Gene_Info_Comments GLEAN3_25138 ###
Domains: DEATH, NACHT.
The Fgenesh model has 4 additional exons and has LRRs.
###Gene_Info_Comments GLEAN3_28387 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_02231 ###
GLEAN3_02231 is a duplicate prediction for GLEAN3_04912.
###Gene_Info_Comments GLEAN3_04163 ###
GLEAN3_04163 is a duplicate prediction for GLEAN3_21088.
###Gene_Info_Comments GLEAN3_04597 ###
GLEAN3_04597 is a duplicate prediction for GLEAN3_10556.
###Gene_Info_Comments GLEAN3_22130 ###
Domains: Signal peptide, DEATH, NACHT, LRRs.

###Gene_Info_Comments GLEAN3_05004 ###
GLEAN3_05004 is a partial duplicate prediction for GLEAN3_03407
###Gene_Info_Comments GLEAN3_05099 ###
Partial sequence.
###Gene_Info_Comments GLEAN3_12661 ###
looks like C-terminal region of a myosin 10

PH-MyTH4-B41 - missing N-terminal region containing MySc and IQ repeats
###Gene_Info_Comments GLEAN3_18982 ###
looks like C-terminal region of a myosin 10

PHx2-MyTH4-B41 - missing N-terminal region containing MySc and IQ repeats
###Gene_Info_Comments GLEAN3_23549 ###
looks like a partial of myosin 10
MySc-IQx3-PHx2-MyTH4 - missing B41 domain at C-terminus
###Gene_Info_Comments GLEAN3_26077 ###
looks like a C-terminal fragment of myosin 7 or 15

SH3-MyTH4-B41

missing N-terminal part with MySc-IQ-IQ-IQ-MyTH4-B41
###Gene_Info_Comments GLEAN3_28273 ###
looks like a fragment of a myosin 10

PH-PH-MyTH4

missing both N-terminus - MySc plus IQ repeats and C-terminus - B41 



###Gene_Info_Comments GLEAN3_06732 ###
just a reelin domain - nothing else - could be part of a reelin
###Gene_Info_Comments GLEAN3_28149 ###
multiple ANK domains, one SAM and a PARP domain

looks like Tankyrase - ADP ribosylase reportedly involved in modifying telomere-associated proteins and regulating GLUT4 traffic in the Golgi
###Gene_Info_Comments GLEAN3_07551 ###
9 ankyrin repeats and a single SAM domain

Too many repeats to be exactly SANS - unless that actually has more - but definitely a homolog
###Gene_Info_Comments GLEAN3_22872 ###
four ankyrin domains and a single SAM - looks like a pretty good match to SANS - has one more ankyrin repeat

SANS is the USH1G_HUMAN Usher syndrome type 1G protein (Scaffold protein containing ankyrin repeats and SAM domain)
###Gene_Info_Comments GLEAN3_25058 ###
has 4 ankyrin repeats, two SAM domains and a PTB domain

This gene organization is found in vertebrates, usually with 5 or 6 ankyrin repeats (also in bees).

Names assigned to those genes include cajalin (rat) and odin (human)
Also known as EB-1 and ANKS1B
###Gene_Info_Comments GLEAN3_14167 ###
large protein with N-terminal ANK/SH3-----SAM-SAM and then a long run of undistinguished sequence.

This structure exists in chordates - named caskin-1 (CASK-interacting protein) in humans.

Note - Shank, a PSD-organizing protein involved in dendritic spine organization, is similar but has PDZ domain as well.
###Gene_Info_Comments GLEAN3_00935 ###
This is the 3' end of the gene.  The 5' end is GLEAN3_13520
###Gene_Info_Comments GLEAN3_13520 ###
FARP matches approximately the first 250 aa. This is the 5' end of the gene; the 3' end appears to be GLEAN3_00935
###Gene_Info_Comments GLEAN3_07771 ###
Duplication of GLEAN3_08684.  Looks like a splice variant
###Gene_Info_Comments GLEAN3_11527 ###
This is the 3' end--GLEAN3_25443 contains the first half of the gene.
###Gene_Info_Comments GLEAN3_10861 ###
PARTIAL Sequence, almost identical to middle of GLEAN 14908 ECT2
###Gene_Info_Comments GLEAN3_19189 ###
Best Blast is a Kalirin protein (a highly related family member), but based on sequence alignment and domain structure this is a TRIO not KALRN.  The TRIO Like 1 gene is in three parts.  5'end is GLEAN 22793.  The middle is 02796, while this GLEAN is the rest.
###Gene_Info_Comments GLEAN3_22793 ###
This is the 5'end of this gene.  The middle is GLEAN 02796, while the end is GLEAN 19189.
###Gene_Info_Comments GLEAN3_28457 ###
This is the end of ITSN2 gene.  The rest of the gene (5'end) is in GLEAN 03961.
###Gene_Info_Comments GLEAN3_03961 ###
This is the 5' end of this gene.  The rest is located in GLEAN 28457.
###Gene_Info_Comments GLEAN3_19587 ###
This gene is spread among 3 GLEANs.  This is the 5'end.  The middle is GLEAN 15117, and the 3' end is GLEAN 28316.
###Gene_Info_Comments GLEAN3_15117 ###
This gene is spread among 3 GLEANs.  This is the middle.  The 5'end is GLEAN 19587, and the 3' end is GLEAN 28316.
###Gene_Info_Comments GLEAN3_28316 ###
This gene is in three parts.  This is the 3'end.  The 5'end is GLEAN 19587, and the middle is GLEAN 15117.
###Gene_Info_Comments GLEAN3_14333 ###
Appears to be a splice variant of GLEAN3_02673
###Gene_Info_Comments GLEAN3_02796 ###
This is the middle of this gene.  The 5' end is GLEAN 22793, while most (and the 3'end) is 19189.
###Gene_Info_Comments GLEAN3_13479 ###
This is the middle and 3'end of this gene.  The 5' end is GLEAN3_18498
###Gene_Info_Comments GLEAN3_18498 ###
This is the 5'end of the gene.  The rest of the gene is found in GLEAN3_13479
###Gene_Info_Comments GLEAN3_01797 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and domain structure.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix C.2 is arbitrary and just reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.

Note that GLEAN3_01794 (Sp-MACPF-C.1), a model adjacent to this gene, also contains the MACPF domain. A comparison of their protein sequences reveals high similarity but a fair number of differences as well. It is to be determined whether this fact reflects the erroneous assembly of different haplotypes (both genes are indeed located in an area of numerous contigs) or if it reflects a true gene duplication event.
###Gene_Info_Comments GLEAN3_02550 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and domain structure.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix B.1 is arbitrary and just reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.

The modification to this model was made based on an otherwise identical Fgenesh++ prediction that includes a prediction for a signal peptide, a feature of all other Sp-MACPF genes, and an additional exon supported by the tiling array hybridization data.

The embryonic expression and modified structure of this model are strongly supported by the tiling array data.

NB: An adjacent model (GLEAN3_02548/9) is very similar in sequence and exon structure. However, there are noticeable differences between the two. It is difficult at this point to determine whether this represents an assembly error or a true duplication/reversion event.
###Gene_Info_Comments GLEAN3_14984 ###
This model was annotated and modified based on a manual inspection of multiple protein sequence alignments and domain structure.

In their original form, most predictions for this model forced two C-terminal exons into the structure of this gene due to the presence of a large NNNNNNNN gap (150+ kb) that likely includes the last exon of this gene. We have modified this model by deleting the last two exons and noting the incompleteness of this model.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix A.3 is arbitrary and just reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.
###Gene_Info_Comments GLEAN3_15144 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and domain structure.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix B.2 is arbitrary and just reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.

The embryonic expression and structure of this model are strongly supported by the tiling array data. Furthermore, there is a significant amount of signal coming from one of the introns. However, no other gene prediction protocol called for an exon in the region.

Note that this model includes a prediction for a transmembrane domain towards the C-terminus of the protein, which is uncharacteristic of the other Sp-MACPF genes. However, since there are no alternative models for this region, we cannot rule out that this is a true feature of this gene. It should be noted, however, that this model is located at the end of a scaffold, and the exon call may have been forced.
###Gene_Info_Comments GLEAN3_22318 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and domain structure.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix D.2 is arbitrary and just reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.
###Gene_Info_Comments GLEAN3_26119 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and domain structure.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix D.3 is arbitrary and just reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.
###Gene_Info_Comments GLEAN3_27405 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and domain structure.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix D.4 is arbitrary and just reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.

Note that this model is likely incomplete. Based on the position of this model close to an end of a scaffold, and the structure of other closely related models and Sp-MACPF genes in general, it is likely that there is missing N-ter sequence (including a signal peptide).

The embryonic expression and structure of this model are partly supported by the tiling array data.
###Gene_Info_Comments GLEAN3_28756 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and domain structure.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix E.2 is arbitrary and just reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.
###Gene_Info_Comments GLEAN3_22230 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and domain structure.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix E.3 is arbitrary and just reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.
###Gene_Info_Comments GLEAN3_14229 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and domain structure.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix F.1 is arbitrary and just reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.

Note that most other Sp-MACPF genes include a signal peptide. However, none of the gene prediction protocols found one for this model. This is unlikely to be due to a lack of scaffold sequence.
###Gene_Info_Comments GLEAN3_02921 ###
Very similar to GLEAN3_02923; looks like local duplication.
###Gene_Info_Comments GLEAN3_02923 ###
Very similar to GLEAN3_02921; looks like local duplication.
###Gene_Info_Comments GLEAN3_27596 ###
Very similar but not identical to GLEAN3_06866.
###Gene_Info_Comments GLEAN3_13704 ###
duplicate of glean3_01739
###Gene_Info_Comments GLEAN3_00386 ###
duplicate of glean3_12252
###Gene_Info_Comments GLEAN3_08485 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and domain structure.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix E.4 is arbitrary and just reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.
###Gene_Info_Comments GLEAN3_02549 ###
This model was annotated and modified based on a manual inspection of multiple protein sequence alignments and domain structure.

The modification to this model was made based on an NCBI model that spans and perfectly matches this and an adjacent glean model (GLEAN3_02548). The modified nucleotide and protein sequences are provided for each of the fused glean models; but the gene features of only this model have been modified to reflect the fusion.

The predicted domain structure for the modified model includes only a signal peptide and the MACPF domain, a feature of all other Sp-MACPF genes.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix B.0 is arbitrary and just reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.

The embryonic expression and modified structure of this model are strongly supported by the tiling array data.

NB: An adjacent model (GLEAN3_02550) is very similar in sequence and exon structure. However, there are noticeable differences between the two. It is difficult at this point to determine whether this represents an assembly error or a true duplication/reversion event.
###Gene_Info_Comments GLEAN3_02548 ###
This model was annotated and modified based on a manual inspection of multiple protein sequence alignments and domain structure.

The modification to this model was made based on an NCBI model that spans and perfectly matches this and an adjacent glean model (GLEAN3_02549). The modified nucleotide and protein sequences are provided for each of the fused glean models; but the gene features of only GLEAN3_02549 have been modified to reflect the fusion. The gene features for this model have been accepted in their present form for simplicity. Please refer to GLEAN3_02549 for the modified exon structure.

The predicted domain structure for the modified model includes only a signal peptide and the MACPF domain, a feature of all other Sp-MACPF genes.

This model belongs to a family of models that contain a Membrane attack/perforin domain (MACPF) and show similarity in domain structure to apextrin. The suffix B.0 is arbitrary and just reflects the clustering of this model with other Sp-MACPF genes in phylogenetic trees.

The embryonic expression and modified structure of this model are strongly supported by the tiling array data.

NB: An adjacent model (GLEAN3_02550) is very similar in sequence and exon structure. However, there are noticeable differences between the two. It is difficult at this point to determine whether this represents an assembly error or a true duplication/reversion event.
###Gene_Info_Comments GLEAN3_21313 ###
The lim domains are in GLEAN3_04021, located in scaffold5
###Gene_Info_Comments GLEAN3_22817 ###
duplicate of glean3_25302.  See that model for data.
###Gene_Info_Comments GLEAN3_02246 ###
Tiling data suggests several exons are wrong.
###Gene_Info_Comments GLEAN3_02766 ###
no embryonic expression based on tiling
###Gene_Info_Comments GLEAN3_12277 ###
This is the PH domain that corresponds to the rest of the Sp-Tec protein (glean3_12278).
###Gene_Info_Comments GLEAN3_09691 ###
PLC eta is distributed over four GLEAN3 predictions (GLEAN3_09688-09691).  Complete annotation is located on this model (GLEAN3_09691).
###Gene_Info_Comments GLEAN3_09690 ###
Protein sequence continued on glean3_09691 glean3_09689 glean3_09688.  This annotation contains the PH domain of PLC eta.

Complete annotation is located on GLEAN3_09691
###Gene_Info_Comments GLEAN3_09689 ###
PLC eta is distributed over four GLEAN3 predictions (GLEAN3_09688-09691).  Complete annotation is located on GLEAN3_09691
###Gene_Info_Comments GLEAN3_09688 ###
PLC eta is distributed over four GLEAN3 predictions (GLEAN3_09688-09691).  Complete annotation is located on GLEAN3_09691
###Gene_Info_Comments GLEAN3_02630 ###
partial duplicate of glean3_21309.  Homeodomain and C-terminus are identical to Sp-Hox8.
###Gene_Info_Comments GLEAN3_16561 ###
High similarity with human CDKL5. Partial N-terminal sequence
###Gene_Info_Comments GLEAN3_22670 ###
High similarity with human CDKL5. Partial N-terminal sequence
###Gene_Info_Comments GLEAN3_03424 ###
serine-threonine, not tyrosine
###Gene_Info_Comments GLEAN3_03389 ###
contains an olfactomedin domain - no collagen repeats or other predicted domains.
###Gene_Info_Comments GLEAN3_24073 ###
LY x2-EGF x5-SEA-EGF - no obvious TM domain

SEA/EGF COMBINATION IS CHARACTERISTIC OF MUCINS
###Gene_Info_Comments GLEAN3_25178 ###
LY-EGF x6-SEA-EGF - no obvious TM domain

SEA/EGF COMBINATION IS CHARACTERISTIC OF MUCINS
###Gene_Info_Comments GLEAN3_10472 ###
CCP x4 - SEA - EGF - apparent transmembrane domain

SEA/EGF COMBINATION IS CHARACTERISTIC OF MUCINS
###Gene_Info_Comments GLEAN3_03285 ###
SEA-EGF - apparent TM domain

SEA/EGF COMBINATION IS CHARACTERISTIC OF MUCINS
###Gene_Info_Comments GLEAN3_27741 ###
large protein with EGF/SEA/EGF flanked on both sides by long low complexity sequence blocks

SEA/EGF COMBINATION IS CHARACTERISTIC OF MUCINS
###Gene_Info_Comments GLEAN3_06884 ###
SEA/EGF - apparent TM domain
SEA/EGF COMBINATION IS CHARACTERISTIC OF MUCINS
###Gene_Info_Comments GLEAN3_05135 ###
single SEA domain - characteristic of mucins
###Gene_Info_Comments GLEAN3_20148 ###
sequence identical to GLEAN3_08768
###Gene_Info_Comments GLEAN3_01134 ###
Exact same sequence as GLEAN3_05105
###Gene_Info_Comments GLEAN3_27666 ###
Likely orthologue of Aurora A; partially included in GLEAN3_27833
###Gene_Info_Comments GLEAN3_06082 ###
C-terminal part of GLEAN3_00964
###Gene_Info_Comments GLEAN3_27384 ###
small part of GLEAN3_00964
###Gene_Info_Comments GLEAN3_14001 ###
part of GLEAN3_05312
###Gene_Info_Comments GLEAN3_00852 ###
Domains: Signal peptide, DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_03186 ###
Domains: DEATH, NACHT, LRRs.
Fgenesh model has 6 additional small exons at the C-terminus, and contains additional LRRs.
###Gene_Info_Comments GLEAN3_17245 ###
Domains: Signal peptide, DEATH, NACHT, LRRs.
Fgenesh model has 6 additional c-terminal exons, the first 4 of which contain additional LRRs.
###Gene_Info_Comments GLEAN3_14503 ###
Domains: Signal peptide, DEATH, NACHT, LRRs.

###Gene_Info_Comments GLEAN3_28650 ###
Probable homologue of human ficolin 2.
###Gene_Info_Comments GLEAN3_28805 ###
Domains: Signal peptide, DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_08597 ###
Domains: Signal peptide, NACHT, LRRs.
The Genscan model has a DEATH domain and no signal peptide, and may be more accurate.
###Gene_Info_Comments GLEAN3_19700 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_04096 ###
IDENTICAL TO GLEAN3_04496
NOT Embryonically expressed.  May be pseudo-gene or adult-only gene.
###Gene_Info_Comments GLEAN3_01444 ###
Domains: Signal peptide, DEATH, NACHT, LRRs.
The Fgenesh model has 2 additional small exons, which may contain additional LRRs not detected by the SMART program.
###Gene_Info_Comments GLEAN3_23120 ###
Domains: NACHT, LRRs.
The Genscan model is probably incomplete since it doesn't start with a Met, but it has an additional 5' exon and contains a DEATH domain.
###Gene_Info_Comments GLEAN3_28483 ###
Domains: Signal peptide, DEATH, NACHT, LRRs.
The Fgenesh model has slightly different exon/intron structure at the 3' end and could contain additional LRRs not detected by the SMART program.
###Gene_Info_Comments GLEAN3_00816 ###
Domains: Signal peptide, DEATH, NACHT, LRRs.

###Gene_Info_Comments GLEAN3_19699 ###
Domains: Signal peptide, DEATH, NACHT, LRRs.

###Gene_Info_Comments GLEAN3_13952 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_04165 ###
Domains: Signal peptide, DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_15481 ###
Domains: DEATH, NACHT, LRRs.
The Fgenesh model has 7 additional small 3' exons that code for additional LRRs.
###Gene_Info_Comments GLEAN3_06733 ###
Domains: NACHT, LRRs.
This gene model is at the end of a short scaffold and could be incomplete, missing the DEATH domain.
###Gene_Info_Comments GLEAN3_08283 ###
Domains: DEATH, NACHT, LRRs.
The Genscan model has 3 additional small 3' exons that code for additional LRRs.
###Gene_Info_Comments GLEAN3_13465 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_25204 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_00457 ###
Domains: NACHT, LRRs.
The Genscan model has a different exon/intron structure and has DEATH domains (one at N-terminus and one c-terminal to the NACHT domain). 
###Gene_Info_Comments GLEAN3_05993 ###
Domains: DEATH, NACHT, LRRs.
This gene model is at the end of a scaffold and could be missing LRRs.
###Gene_Info_Comments GLEAN3_06203 ###
Domains: DEATH, NACHT, LRRs.
The Genscan model has 7 additional small exons at the 3' end and has additional LRRs.

###Gene_Info_Comments GLEAN3_00015 ###
Domains: DEATH, NACHT, LRRs.
This gene model is on a short scaffold and could be incomplete, missing LRRs.
###Gene_Info_Comments GLEAN3_19696 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_12523 ###
Domains: NACHT, LRRs.
This gene model is on a short scaffold and could be incomplete, missing the DEATH domain.
###Gene_Info_Comments GLEAN3_25948 ###
Domains: DEATH (2), NACHT, LRRs.
This gene model overlaps with Glean3_25952. The Genscan model has a different terminal exon and does not overlap the other gene model. 
###Gene_Info_Comments GLEAN3_07113 ###
Domains: NACHT, LRRs.
Glean3_07116 has a low e-value DEATH domain.
###Gene_Info_Comments GLEAN3_22412 ###
Domains: Sulfatase, NACHT, LRRs.
The Fgenesh model also has a DEATH domain before the NACHT domain. This is a very unusual domain combination.
Also, there is a very large exon between the exons coding for the sulfatase domain and the NACHT and LRRs.
###Gene_Info_Comments GLEAN3_21844 ###
Domains: NACHT, LRRs.
This gene model is on a short scaffold and could be incomplete, missing the DEATH domain.
###Gene_Info_Comments GLEAN3_11441 ###
Domains: NACHT, LRRs.
The Fgenesh model combines Glean3_11441 and Glean3_11440. It has a DEATH domain.
###Gene_Info_Comments GLEAN3_10053 ###
Domains: NACHT, LRRs.
The Genscan model combines Glean3_10053 and Glean3_10052, which contains the DEATH domain.
###Gene_Info_Comments GLEAN3_11439 ###
Domains: NACHT, DEATH, LRRs.

###Gene_Info_Comments GLEAN3_06229 ###
Domains: Signal peptide, DEATH, NACHT, LRRs.
This gene model is on a small scaffold and could be incomplete, missing LRRs.
###Gene_Info_Comments GLEAN3_02272 ###
Domains: NACHT, LRRs. 
This gene model could be incomplete, missing the DEATH domain(s).
###Gene_Info_Comments GLEAN3_10097 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_28630 ###
Domains: DEATH, NACHT.
Glean3_28631 contains 4 exons coding for LRRs. The accurate gene model should probably be a fusion of these 2 gene models, such as the Genscan.
###Gene_Info_Comments GLEAN3_05301 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_15052 ###
Domains: DEATH, NACHT, LRRs.
This gene model sits at the end of a scaffold and could be incomplete, missing LRRs.
###Gene_Info_Comments GLEAN3_24605 ###
9 armadillo repeats - matches best with plakoglobin/gamma catenin but also, less well, with beta catenin and poorly with p120/delta catenin
###Gene_Info_Comments GLEAN3_20240 ###
Domains: NACHT, LRRs.
This gene model is likely incomplete. There is an Fgenesh model just 5' of this glean that codes for 2 DEATH domains.
###Gene_Info_Comments GLEAN3_14761 ###
Domains: DEATH, DEATH, NACHT, LRRs.
The Fgenesh model has 2 additional exons that code for LRRs. This gene model is at the end of a scaffold and could be incomplete, missing LRRs.
###Gene_Info_Comments GLEAN3_08833 ###
Domains: Signal peptide, DEATH, NACHT, LRRs.
This gene model is on a small scaffold and could be incomplete, missing LRRs and/or a second DEATH domain.
###Gene_Info_Comments GLEAN3_01016 ###
Domains: NACHT, LRRs. 
The Genscan model has 2 additional exons (Glean3_01015) that code for a DEATH domain. The gene features were modified to reflect this model.
The Fgenesh model has 2 additional 3' exons that code for additional LRRs.
###Gene_Info_Comments GLEAN3_20414 ###
The KRP170 gene spans GLEAN3_20414 (Scaffold1612/Scaffoldi17703) and GLEAN3_26503 (Scaffold56862/Scaffoldi4507).  The mRNA was published as Chui,K.K., Rogers,G.C., Kashina,A.M., Wedaman,K.P., Sharp,D.J., Nguyen,D.T., Wilt,F. and Scholey,J.M.  "Roles of two homotetrameric kinesins in sea urchin embryonic cell division."   J. Biol. Chem. 275 (48), 38005-38011 (2000).  The GenBank entry for this gene is gi|10697491|gb|AF292395.2|.
Annotation by RA Obar, RL Morris, AL Silverio, BJ Chick, EJ Jin.
###Gene_Info_Comments GLEAN3_16794 ###
Domains: NACHT, LRRs.
This gene model is at the end of a scaffold and is likely incomplete, missing the DEATH domain.
###Gene_Info_Comments GLEAN3_28595 ###
Domains: Signal peptide, DEATH, NACHT, LRRs.

###Gene_Info_Comments GLEAN3_03640 ###
Domains: DEATH, NACHT, LRRs.
The Fgenesh model has additional exons that could code for more LRRs.
###Gene_Info_Comments GLEAN3_03762 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_03366 ###
Domains: DEATH, NACHT, LRRs.
The Fgenesh model just downstream has 5 exons that code for LRRs.
###Gene_Info_Comments GLEAN3_27511 ###
Domains: DEATH, NACHT, LRRs.
The Genscan model has 3 additional exons at the 3' end that code for additional LRRs.
###Gene_Info_Comments GLEAN3_18384 ###
Domains: DEATH, NACHT, LRRs.
The Fgenesh model is slightly different and codes for additional LRRs.

###Gene_Info_Comments GLEAN3_26189 ###
Domains: NACHT, LRRs.
The Genscan model is slightly different and also encodes a DEATH domain. It is likely incomplete however, since this is a short scaffold.
###Gene_Info_Comments GLEAN3_28060 ###
Domains: Signal peptide, DEATH, NACHT, LRRs.
This gene model could be incomplete since it is on a small scaffold.
###Gene_Info_Comments GLEAN3_25179 ###
Domains: DEATH, NACHT, LRRs.
The Genscan model has additional small exons that code for additional LRRs.
###Gene_Info_Comments GLEAN3_05026 ###
Domains: DEATH, DED, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_09160 ###
Domains: DEATH.
The Genscan model combines Glean3_09160 and Glean3_09161. Glean3_09161 has NACHT and LRRs. The gene features of this model have been modified to reflect the Genscan model.
###Gene_Info_Comments GLEAN3_09161 ###
This gene model is combined with Glean3_09160. Please refer to that model for details.
###Gene_Info_Comments GLEAN3_28631 ###
This gene model is combined with Glean3_28630. Please refer to that gene model for details.
###Gene_Info_Comments GLEAN3_25166 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_09659 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_05609 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_28433 ###
Domains: Signal peptide, DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_11855 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_17505 ###
Domains: DEATH, NACHT, LRRs.
Glean3_17506 has additional LRRs that are probably part of this gene model.
###Gene_Info_Comments GLEAN3_17506 ###
This gene model contains LRRs only and probably belongs to Glean3_17505. See that gene model for details.
###Gene_Info_Comments GLEAN3_17196 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_03247 ###
Domains: DEATH, NACHT, LRRs.
Both the Genscan and the Fgenesh models have different exon/intron structures in the 3' end of the gene, which code for additional LRRs.
###Gene_Info_Comments GLEAN3_08498 ###
Domains: NACHT, LRRs.
This gene model is on a small scaffold and could be incomplete, missing the DEATH domain(s).
###Gene_Info_Comments GLEAN3_27610 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_03797 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_22001 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_28820 ###
Domains: NACHT, LRRs.
This gene model is on a short scaffold and could be incomplete, missing the DEATH domain(s).
###Gene_Info_Comments GLEAN3_16810 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_26622 ###
Domains: NACHT, LRRs.
This gene model is on a small scaffold and could be incomplete, missing DEATH domain(s).

###Gene_Info_Comments GLEAN3_05383 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_00318 ###
Gene neighbour of the ParaHox cluster in chordates.
###Gene_Info_Comments GLEAN3_23183 ###
Domains: NACHT, LRRs.
The Genscan model has a different exon/intron structure and also codes for a DEATH domain N-terminal of the NACHT domain.

###Gene_Info_Comments GLEAN3_16257 ###
Domains: DEATH, NACHT, LRRs.
This gene model has a final intron of more than 150kb. This is very unusual. The Fgenesh model has a different final exon, which is much closer to the rest of the gene, and could be a more accurate model.
###Gene_Info_Comments GLEAN3_24419 ###
EST data: 

BG784137 SEAUMC004094 Sea urchin primary mesenchyme cell cDNA library Strongylocentrotus purpuratus cDNA clone PC_0020_A1_G10_MR 5', mRNA sequence

>gi|57955070|gb|CX692889.1|CX692889 yde99f06.y1 Sea urchin EST Lib1 Strongylocentrotus purpuratus cDNA clone yde99f06 5' similar to TR:Q9VK69 Q9VK69 CG5525 PROTEIN. ;, mRNA sequence
###Gene_Info_Comments GLEAN3_16926 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_01210 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_04343 ###
Domains: DEATH, NACHT, LRRs.

###Gene_Info_Comments GLEAN3_14122 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_25600 ###
Domains: DEATH, NACHT, LRRs.
The Genscan model has additional small exons that code for more LRRs.
###Gene_Info_Comments GLEAN3_00523 ###
Domains: DEATH, NACHT, LRRs.
This gene model is on a short scaffold and could be incomplete, missing LRRs.
###Gene_Info_Comments GLEAN3_08707 ###
Domains: DEATH, NACHT, LRRs.
The Genscan model has additional small exons that code for additional LRRs.
###Gene_Info_Comments GLEAN3_07446 ###
Domains: DEATH, NACHT, LRRs.
This gene model is on a short scaffold and could be incomplete, missing LRRs.
###Gene_Info_Comments GLEAN3_18173 ###
7 ARMADILLO REPEATS - similar to karyopherin (importin) alpha 6  - clusters with human importins
###Gene_Info_Comments GLEAN3_13477 ###
8 ARMADILLO REPEATS - similar to karyopherin (importin) alpha 6  - clusters with human importins
###Gene_Info_Comments GLEAN3_04456 ###
single armadillo repeat - clusters with importins 
###Gene_Info_Comments GLEAN3_24721 ###
This is likely to be split between this gene and glean 24930. Correct protein sequence is predicted to be:
MAQAYVEDTLMVRHELKHAYGKVFLVRKVGNHNQGKLYAMKVLKKATIVQ
KAKTAEHTMTERQVLEAVRSCPFLVTLHYAFQTDSKLNLILDYVNGGELF
THLYQREHFRESEVRIYIAEIIIALDCLHRILTSHPPMPNTFSKEVKDFI
NKLLVKDPTKRLGCNGVKDIKSHSFFKGLNWDDVAAKRVSPPFRPHINGE
LDTSNFAEEFTSLVPADSPADIPKTADARVFRVGYSFIAPSILYSDNAIT
QDMLTQPSEHNRPSLASILSIHELKDSPFNKYYELDMKSAPIGDGSFSIC
RRCTHRKTEKEYAVKIVSRRVACTQEITTLQLCQKHPNIVHLKEEFKDKL
HTYIIMELCKGGELLGRIRKKKHFDELEASMIMRKLVSAVDYMHSRGIVH
RDLKPENILFTDDSDDAELKIIDFGFARITNSNQPLKTPCFSLHFAAPEV
LKRAYEQDGEYDASCDVWSLGVILYTMLSGRVPFQDPSISKSNSASDIMK
RIKHGNFSFDGEEWNSVSTPAKDLIKGLLTVDPSRRLTTDDLLQNEWIQG
QQLSTSTPLMTPDILNSCASIQKRVKATMRAFHTAQREGFLLTDVSNAPL
AKRRKKKKDSSTETRSSSSESTHSQSSSSQESTTPTPTANPVLTIPVTTV
SCAPRTTTATGAPSIPSVQPLPSLSKQTGARLDQYESLESLGFSPILPFS
AGGSQELPPLLARQDSGYVGQMPSYAQVTPVPRTNVGSHGVTYAPILDPS
MYPCGLQQPILDFSSSIPEYLSVQYASTEQPSIPMTVPRTLHQPHPHPLP
LPHQHLSHLPTISEDPSTT

###Gene_Info_Comments GLEAN3_00512 ###
Predicts the carboxy-terminal end of a dual oxidase with an exon structure similar to that of Sp-Udx1 (GLEAN3_00513).  May link to GLEAN3_25507, which represents a paralogous amino-terminal domain to Sp-Udx1 (GLEAN3_00513).
###Gene_Info_Comments GLEAN3_25518 ###
Best hit to Xenopus Timeless, needs to be correlated with GLEAN3_06230
###Gene_Info_Comments GLEAN3_26036 ###
Domains: DEATH, DEATH, LRRs, NACHT, LRRs, ZU5.
The Fgenesh model has a different exon/intron structure and is missing the last 2 exons, and therefore the ZU5 domain. It is likely more accurate, since this domain is not associated with this type of protein in any other organism.

The presence of LRRs between the DEATH domains and the NACHT domain is also unusual.

###Gene_Info_Comments GLEAN3_00863 ###
Domains: DEATH, NACHT, LRRs.
The Genscan model fuses Glean3_00863 with Glean3_00864 (which is comprised of 4 small exons) that could code for additional LRRs.
###Gene_Info_Comments GLEAN3_20641 ###
Would prefer to use Arl13b (for clarity's sake), but ARL2L1 appears to be the preferred naming scheme.
Also, note that tiling array shows signal on only 6/10 exons (missing signal at 5' exons).  No EST to confirm.  Likely that 5' end is wrong.
###Gene_Info_Comments GLEAN3_23642 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_27808 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_26400 ###
Domains: NACHT, LRRs.
This gene model is at the end of a scaffold and could be incomplete, missing the DEATH domain(s).
###Gene_Info_Comments GLEAN3_25167 ###
Domains: NACHT, LRRs.
The Fgenesh model has additional exons in the 5' region that code for a DEATH domain. However, this model is truncated at the 3' end and is missing a large portion of the LRRs. A combination of both models is likely the most accurate version of this gene.
###Gene_Info_Comments GLEAN3_11088 ###
Domains: NACHT, LRRs.
The Genscan model has additional exons that code for a DEATH domain. This gene model is on a short scaffold and could be incomplete, missing LRRs.
###Gene_Info_Comments GLEAN3_22070 ###
Domains: NACHT, LRRs.
This gene model is at the end of a scaffold and could be incomplete, missing the DEATH domain(s).
###Gene_Info_Comments GLEAN3_01884 ###
Domains: NACHT, LRRs.
The Genscan model has 2 additional exons that code for additional LRRs.
This gene model is on a short scaffold and could be incomplete, missing the DEATH domain(s).
###Gene_Info_Comments GLEAN3_15206 ###
Domains: DEATH, EGF, NACHT, LRRs.
The presence of the EGF domain in this model is unusual.

The Genscan model combines Glean3_15206 and Glean3_15207. Glean3_15207 has 4 exons that code for additional LRRs. The Genscan model is therefore likely to be more accurate.
###Gene_Info_Comments GLEAN3_15207 ###
This gene model contains LRRs only but is likely a part of the Glean3_15206 model. Please see that gene model for details.
###Gene_Info_Comments GLEAN3_11035 ###
Domains: DEATH, DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_22441 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_06456 ###
Domains: NACHT, LRRs.
This gene model is on a small scaffold and could be incomplete, missing the DEATH domain(s) and LRRs.
###Gene_Info_Comments GLEAN3_06587 ###
Domains: Signal peptide, NACHT, DEATH, LRRs.
The domain organization of this gene model is unusual, since the DEATH domain is usually N-terminal to the NACHT domain.
###Gene_Info_Comments GLEAN3_15205 ###
Domains: DEATH, EGF(2), NACHT, LRRs.
The presence of the EGF domains in this gene model is unusual but it is also seen in the gene model just downstream on the same scaffold: Glean3_15206.
###Gene_Info_Comments GLEAN3_01630 ###
Domains: NACHT, LRRs.
This gene model is located on a short scaffold and is likely incomplete, missing the DEATH domain(s).
###Gene_Info_Comments GLEAN3_12713 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_01549 ###
Domains: NACHT, LRRs.
This gene model is on a short scaffold and is likely incomplete, missing the DEATH domain(s).
###Gene_Info_Comments GLEAN3_05732 ###
Domains: DEATH, NACHT, LRRs.
The Genscan model has three additional small exons that code for additional LRRs.
###Gene_Info_Comments GLEAN3_27300 ###
Domains: DEATH, EGF, NACHT, LRRs.
The presence of the EGF domain in this gene model is unusual.

The Fgenesh model has 2 additional terminal exons that code for additional LRRs.
This gene is at the end of a scaffold and could be incomplete, missing LRRs.

###Gene_Info_Comments GLEAN3_20569 ###
Domains: DEATH, EGF, NACHT, LRRs.
The presence of the EGF domain in this gene model is unusual.

###Gene_Info_Comments GLEAN3_02888 ###
Domains: NACHT, LRRs.
This gene model is on a short scaffold and is likely incomplete, missing DEATH domain(s) and LRRs.
###Gene_Info_Comments GLEAN3_24709 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_10052 ###
Domains: DEATH.
This gene model is likely part of a larger model that includes Glean3_10053. See this model for further details.
###Gene_Info_Comments GLEAN3_00738 ###
Domains: NACHT, LRRs.
This gene model is on a short scaffold and could be incomplete, missing the DEATH domain(s).
###Gene_Info_Comments GLEAN3_03303 ###
Domains: NACHT, LRRs.
The Genscan model has 3 additional exons at the 5' end that code for a DEATH domain. The Genscan model is likely to be more accurate than the Glean model.

This gene model is on a short scaffold and could be incomplete, missing a DEATH domain and/or LRRs.
###Gene_Info_Comments GLEAN3_10091 ###
Domains: DEATH, NACHT, LRRs.
The DEATH and NACHT domains overlap. The Genscan model has an additional small exon before the one coding for the NACHT domain and effectively "separates" the DEATH and NACHT domains. However, the last 2 exons of the Genscan model (not in the Glean model) code for a Zn finger domain and probably do not belong to this protein. The accurate gene model is likely part of the Genscan model and the Glean model.
###Gene_Info_Comments GLEAN3_26320 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_14719 ###
Domains: NACHT, LRRs.
This gene model is located on a short scaffold and is likely incomplete, missing DEATH domain(s) and/or LRRs.
###Gene_Info_Comments GLEAN3_21630 ###
two ANATO domains and nothing else - a "chordate-specific" domain found in complement C3/4/5 and in fibulins.  This looks like a fragment - probably of a fibulin - could be haplotype pair for GLEAN3_26629, which encodes a fibulin.

###Gene_Info_Comments GLEAN3_16532 ###
Domains: LRRs.
The Fgenesh model has additional exons that code for Signal peptide, DEATH and NACHT domains. The gene features and sequences of this gene model were modified to reflect this.
###Gene_Info_Comments GLEAN3_27053 ###
Duplication of the C-term of glean3_02874.
###Gene_Info_Comments GLEAN3_15768 ###
Domains: NACHT, LRRs. 
This gene model is on a short scaffold and could be incomplete, missing the DEATH domain(s).
###Gene_Info_Comments GLEAN3_26304 ###
Domains: NACHT, DEATH, LRRs.
The presence of a DEATH domain C-terminal to the NACHT domain is unexpected since this domain structure is not observed in the vertebrate NLRs, where PYD and CARD domains are N-terminal to the NACHT domain.

The Fgenesh model (S.P_Scaffold1348.seq.N000001:  2,207-2,237) located 5' of this Glean model codes for a DEATH domain that likely belongs to this gene.
###Gene_Info_Comments GLEAN3_01608 ###
Domains: NACHT, LRRs.
This gene model is on a short scaffold and could be incomplete, missing the DEATH domain(s).

###Gene_Info_Comments GLEAN3_28485 ###
Domains: NACHT, LRRs.
This gene model is on a short scaffold and could be incomplete, missing the DEATH domain(s).

###Gene_Info_Comments GLEAN3_06019 ###
Domains: NACHT, LRRs.
This gene model is on a short scaffold and could be incomplete, missing the DEATH domain(s).

###Gene_Info_Comments GLEAN3_05410 ###
Domains: NACHT, LRRs.
This gene model is on a short scaffold and could be incomplete, missing the DEATH domain(s) and/or LRRs.
###Gene_Info_Comments GLEAN3_19497 ###
Domains: NACHT, LRRs.
This gene model is on a short scaffold and could be incomplete, missing the DEATH domain(s) and/or LRRs.
###Gene_Info_Comments GLEAN3_05581 ###
Domains: DEATH, NACHT, LRRs, NACHT, LRRs.
This structure is very unusual, since most NLRs have a single NACHT domain. There is a large (~100kb) intron in this prediction. Therefore, this could be a fusion of 2 gene models.
###Gene_Info_Comments GLEAN3_21370 ###
Domains: NACHT, LRRs.
This gene model is on a short scaffold and could be incomplete, missing the DEATH domain(s) and/or LRRs.
###Gene_Info_Comments GLEAN3_07620 ###
Domains: NACHT, LRRs.
This gene model is on a small scaffold and could be incomplete, missing the DEATH domain(s) and/or LRRs.
###Gene_Info_Comments GLEAN3_23759 ###
Domains: LRRs.
This gene model contains part of a NACHT domain as well. Glean3_23758 has the N-terminal part of the NACHT domain as well as a Signal peptide and DEATH domain. The accurate gene model is probably a fusion of the two, similar to the Genscan prediction. The Genscan model is missing a part of the NACHT domain.
###Gene_Info_Comments GLEAN3_23758 ###
Domains: Signal peptide, DEATH.
This gene model probably belongs as a fusion with GLEAN3_23759. Please see this other model for details.
###Gene_Info_Comments GLEAN3_10667 ###
Domains: NACHT, LRRs.
This gene model is on a short scaffold and could be incomplete, missing the DEATH domain(s) and/or LRRs.
###Gene_Info_Comments GLEAN3_27207 ###
Domains: NACHT, LRRs.
This gene is located on a short scaffold and could be incomplete, missing the DEATH domain(s) and/or LRRs.
###Gene_Info_Comments GLEAN3_22564 ###
Domains: NACHT, LRRs.
The Fgenesh model has three additional exons at the 5' end which code for a DEATH domain. This model is probably a more accurate reflection of the actual gene.
###Gene_Info_Comments GLEAN3_16060 ###
Domains: NACHT, LRRs.
This gene model is on a short scaffold and could be incomplete, missing the DEATH domain(s) and/or LRRs.
###Gene_Info_Comments GLEAN3_13038 ###
Domains: NACHT, LRRs.
This gene model is at the end of a scaffold and could be incomplete, missing the DEATH domain(s).
###Gene_Info_Comments GLEAN3_27035 ###
Domains: NACHT, LRRs.
This gene model is on a short scaffold and could be incomplete, missing the DEATH domain(s) and LRRs.
###Gene_Info_Comments GLEAN3_15033 ###
Domains: NACHT, LRRs.
This gene model is on a short scaffold and could be incomplete, missing the DEATH domain(s).
###Gene_Info_Comments GLEAN3_11097 ###
Domains: NACHT, LRRs.
This gene model is located at the end of a scaffold and could be incomplete, missing the DEATH domain(s).
###Gene_Info_Comments GLEAN3_21930 ###
Domains: DEATH, NACHT, LRRs.
###Gene_Info_Comments GLEAN3_06219 ###
Domains: DEATH, LRRs.
Both the Genscan and the Fgenesh models have a different exon/intron structure and code for a NACHT domain. The Genscan NACHT domain is better (higher e-value). However, this model may not be the most accurate one, since it does not start with a methionine.
The gene features and sequences were changed to reflect the 5'end of the Glean3 model and the NACHT domain region of the Genscan model. 
(The gene feature edit function was not available, so an additional "exon" was created to make the existing one bigger.)
###Gene_Info_Comments GLEAN3_10039 ###
Domains: NACHT, LRRs, CLECT, CLECT.
This gene model is likely a fusion of two models, as the Fgenesh model only has the first 4 exons and doesn't code for the CLECT domains, which are not normally found in this type of protein. Also, the intron separating the NLR domains from the CLECT domains is very large (~150kb), which is very unusual.
This gene model is at the beginning of a scaffold and could be incomplete, missing the DEATH domain(s).
###Gene_Info_Comments GLEAN3_10116 ###
Could contain a repetitive non-coding sequence in the second predicted exon. Without this sequence, the predicted ORF is very similar in length and homology to human Dystroglycan.
###Gene_Info_Comments GLEAN3_03619 ###
Domains: DEATH, 6x(EGF), NACHT.
The presence of the EGF domain in this model is unusual.

The LRRs are not present in this glean model but are included in the corresponding Genscan prediction. The genscan prediction is missing part of the NACHT domain, however. A hybrid of both gene predictions is probably the accurate model.
###Gene_Info_Comments GLEAN3_01423 ###
Domains: DEATH, NACHT, LRRs.
This gene model is located on a small scaffold and could be incomplete, missing LRRs.
Also, the DEATH and NACHT domains overlap in the SMART prediction. This overlap is less extensive in the Genscan model which has a larger 3rd exon. The gene features and sequences for this model reflect the Genscan model.
###Gene_Info_Comments GLEAN3_18582 ###
This GLEAN matches a GenBank entry, "gi|72180194|ref|XP_794270.1| similar to dynein 2 light intermediate chain" exactly.
###Gene_Info_Comments GLEAN3_22644 ###
No signal in tiling array or EST.  May be a pseudogene or expressed only in adults.
###Gene_Info_Comments GLEAN3_09807 ###
NOTE: Based on sequence, alignment and domain analysis, this looks like the 5' end of a single OPA gene that is composed of GLEANs 09807 (5' end) and 06815 (3' end), with no gaps between.
###Gene_Info_Comments GLEAN3_06815 ###
NOTE: Based on sequence, alignment and domain analysis, this looks like the 3' end of a single OPA gene that is composed of GLEANs 09807 (5' end) and 06815 (3' end), with no gaps between.
###Gene_Info_Comments GLEAN3_09673 ###
NOTE: No signal in tiling array and no EST.  May be pseudogene or adult expressed.
###Gene_Info_Comments GLEAN3_09674 ###
NOTE: No signal in tiling array and no EST.  May be pseudogene or adult expressed.
###Gene_Info_Comments GLEAN3_16744 ###
NOTE: No signal in tiling array and no EST.  May be pseudogene or adult expressed.
###Gene_Info_Comments GLEAN3_10656 ###
high homology to chondroitin glucuronate 5-epimerase, dermatan sulfate epimerase, squamous cell carcinoma antigen recognized by T cells 2, SART2

NCAG1 may be an enzyme with dual epimerase and O-sulfotransferase activities involved in dermatan sulfate biosynthesis (Maccarana et al., in press, JBC 2006)
###Gene_Info_Comments GLEAN3_12840 ###
Better hit against Human Traf1 sequence than GLEAN3_26479
###Gene_Info_Comments GLEAN3_05419 ###
Kinase domain + some 3' UTR partially cloned out of egg cDNA
The model is probably missing the 5' most sequence based on comparisons to A.miniata SFK3

Glean CDS and peptide sequences accepted; however, further verification should be done for sequence that is not within the cloned mRNA sequence.
###Gene_Info_Comments GLEAN3_02160 ###
Likely an incomplete version of GLEAN3_10659. Refer to GLEAN3_10659 for the complete sequence.
###Gene_Info_Comments GLEAN3_17675 ###
likely an incomplete prediction, the C-terminal region of the protein is not predicted.
###Gene_Info_Comments GLEAN3_07092 ###
This is the S.purpuratus version of gi|1817526|dbj|BAA09934.1| intermediate chain 1 [Anthocidaris crassispina].
###Gene_Info_Comments GLEAN3_19893 ###
Structure: CARD-DEATH.
The CARD domain is a poor hit in a SMART analysis, but does appear in the 2nd table.
###Gene_Info_Comments GLEAN3_06410 ###
Structure: CARD-DEATH-DEATH.
The CARD domain is a poor hit in a SMART analysis but does appear in the 2nd table.
###Gene_Info_Comments GLEAN3_19506 ###
This GLEAN represents the sea urchin Dynein Intermediate Chain 2, as defined by the Anthocidaris crassispina cDNA (gi|2494214|sp|Q16959|DYI2_ANTCR Dynein intermediate chain 2, ciliary).  GLEAN3_19506 originally represented only the first 477 amino acid residues, while GLEAN3_05973 represented the last 437 residues.  These were merged into GLEAN3_19506.
###Gene_Info_Comments GLEAN3_06354 ###
The prototype of this gene product is from the sea urchin Anthocidaris crassispina (gi|2760163|dbj|BAA24185.1| outer arm dynein light chain 1 [Anthocidaris crassispina]).
###Gene_Info_Comments GLEAN3_18854 ###
This GLEAN represents the sea urchin Dynein Light Chain 2, as defined by the Anthocidaris crassispina cDNA (gi|2760161|dbj|BAA24184.1| outer arm dynein light chain 2 [Anthocidaris crassispina]). 
###Gene_Info_Comments GLEAN3_08471 ###
This GLEAN represents the sea urchin Outer Arm Dynein Light Chain 3, as defined by the Anthocidaris crassispina cDNA (gi|3336986|dbj|BAA31751.1| outer arm dynein LC3 [Anthocidaris crassispina]), and is identical to the GenBank RefSeq entry "XP_783725.1 PREDICTED: similar to t-complex testis expressed 1."
###Gene_Info_Comments GLEAN3_04009 ###
This GLEAN represents the sea urchin Dynein Light Chain Type 6, as defined by the Anthocidaris crassispina cDNA (gi|2811014|sp|O02414|DYL1_ANTCR Dynein light chain LC6, flagellar outer arm), which has an identical amino acid sequence.   The S. purpuratus RefSeq ID is XP_785110.1.
There are several other GLEANs that are identical or nearly identical to this one:
GLEAN3_18607 (100% Amino Acid Identity)
GLEAN3_08799 (93% Amino acid Identity)
GLEAN3_24498 (91% Amino acid Identity)
GLEAN3_24497 (88% Amino acid Identity)
GLEAN3_18567 (88% Amino acid Identity)
GLEAN3_25272 (88% Amino acid Identity)
GLEAN3_24499 (88% Amino acid Identity)
GLEAN3_08800 (86% Amino acid Identity)
GLEAN3_24500 (87% Amino acid Identity)
GLEAN3_08801 (76% Amino acid Identity)
GLEAN3_27937 (70% Amino acid Identity)
GLEAN3_27938 (68% Amino acid Identity)
###Gene_Info_Comments GLEAN3_18607 ###
This GLEAN represents the sea urchin Dynein Light Chain Type 6, as defined by the Anthocidaris crassispina cDNA (gi|2811014|sp|O02414|DYL1_ANTCR Dynein light chain LC6, flagellar outer arm), which has an identical amino acid sequence.   The S. purpuratus RefSeq ID is XP_785110.1.
This gene model is a member of a group of several GLEANs that are identical or nearly identical to GLEAN3_04009:
GLEAN3_18607 (100% Amino Acid Identity)
GLEAN3_08799 (93% Amino acid Identity)
GLEAN3_24498 (91% Amino acid Identity)
GLEAN3_24497 (88% Amino acid Identity)
GLEAN3_18567 (88% Amino acid Identity)
GLEAN3_25272 (88% Amino acid Identity)
GLEAN3_24499 (88% Amino acid Identity)
GLEAN3_08800 (86% Amino acid Identity)
GLEAN3_24500 (87% Amino acid Identity)
GLEAN3_08801 (76% Amino acid Identity)
GLEAN3_27937 (70% Amino acid Identity)
GLEAN3_27938 (68% Amino acid Identity)
###Gene_Info_Comments GLEAN3_15529 ###
Homologous to DNA-PKcs. The 5' end of the gene is most likely missing. Glean3_15528 also contains parts of this gene.
Glean3_15484 contains the 5' end / amino-terminus of the gene.
###Gene_Info_Comments GLEAN3_08799 ###
This GLEAN represents the sea urchin Dynein Light Chain Type 6, as defined by the Anthocidaris crassispina cDNA (gi|2811014|sp|O02414|DYL1_ANTCR Dynein light chain LC6, flagellar outer arm), which has an identical amino acid sequence.   The S. purpuratus RefSeq ID is XP_785110.1.
This gene model is a member of a group of several GLEANs that are identical or nearly identical to GLEAN3_04009:
GLEAN3_18607 (100% Amino Acid Identity)
GLEAN3_08799 (93% Amino acid Identity)
GLEAN3_24498 (91% Amino acid Identity)
GLEAN3_24497 (88% Amino acid Identity)
GLEAN3_18567 (88% Amino acid Identity)
GLEAN3_25272 (88% Amino acid Identity)
GLEAN3_24499 (88% Amino acid Identity)
GLEAN3_08800 (86% Amino acid Identity)
GLEAN3_24500 (87% Amino acid Identity)
GLEAN3_08801 (76% Amino acid Identity)
GLEAN3_27937 (70% Amino acid Identity)
GLEAN3_27938 (68% Amino acid Identity)
The predicted amino acid sequence of GLEAN3_08799 is identical to those of GLEAN3_08800 and GLEAN3_08801.
Because there is a good chance that they do not represent three distinct gene products, they were named as follows:
GLEAN3_08799: Sp-Dynein Light Chain-1-3a
GLEAN3_08800: Sp-Dynein Light Chain-1-3b
GLEAN3_08801: Sp-Dynein Light Chain-1-3c
###Gene_Info_Comments GLEAN3_03242 ###
This GLEAN represents an Outer Arm Dynein Light Chain-like polypeptide (RefSeq ID: gi|72012233|ref|XP_782355.1| PREDICTED: similar to outer arm dynein light chain like (XJ558) [Strongylocentrotus purpuratus]).
###Gene_Info_Comments GLEAN3_04498 ###
This GLEAN represents an Outer Arm Dynein Light Chain-like polypeptide (RefSeq ID: gi|72070876|ref|XP_791620.1| PREDICTED: similar to outer arm dynein light chain like).
###Gene_Info_Comments GLEAN3_26533 ###
This gene model represents a segment of the axonemal Dynein Intermediate Chain 3.  Similarity to Chlamydomonas reinhardtii Inner_Dynein_Arm_1_Intermediate Chain IC140 (C_530081|166736|IA1-IC140|) and Homo sapiens axonemal dynein intermediate polypeptide 2.  Contains WD-40 motifs.  The protein is also essentially identical to the Anthocidaris crassispina protein (gi|2494216|sp|Q16960|DYI3_ANTCR Dynein intermediate chain 3, ciliary).
###Gene_Info_Comments GLEAN3_24517 ###
putative homolog of xrcc4, 3'end / C-terminus is missing in gene model
###Gene_Info_Comments GLEAN3_12239 ###
homolog of Cernunnos (protein) and Xlf (gene)
###Gene_Info_Comments GLEAN3_14075 ###
Annotated by RL Morris and B Rossetti.
###Gene_Info_Comments GLEAN3_06911 ###
This GLEAN corresponds to the first 292 amino acid residues (of a total of 686) of a well-characterized Microtubule-associated Protein known as the "77kDa-MAP."  GLEAN3_06911 overlaps with GLEAN3_05744 by 18 codons, followed by the remainder of the carboxyl-terminus of the protein (a total of 686 codons plus the stop codon) contained in GLEAN3_05744.  The annotation for the full-length gene product is associated with GLEAN3_06911.
###Gene_Info_Comments GLEAN3_05744 ###
This GLEAN corresponds to the last 402 amino acid residues (of a total of 686) of a well-characterized sea urchin Microtubule-associated Protein known as the "77kDa-MAP."  GLEAN3_06911 overlaps with GLEAN3_05744 by 18 codons, followed by the remainder of the carboxyl-terminus of the protein (a total of 686 codons plus the stop codon) contained in GLEAN3_05744.  The annotation for the full-length gene product is associated with GLEAN3_06911.
###Gene_Info_Comments GLEAN3_09444 ###
This GLEAN represents a homolog of the mammalian  Microtubule-associated proteins 1A/1B light chain 3B precursor (gi|72168709|ref|XP_783653.1| PREDICTED: similar to Microtubule-associated proteins 1A/1B light chain 3B precursor (MAP1A/MAP1B LC3) (MAP1A/1B light chain 3) [Strongylocentrotus purpuratus]).  There seem to be two MAP1A/1B_LC3-like proteins encoded in the S. purpuratus genome: GLEAN3_09444 (~72% sequence identity) and GLEAN3_08954 (~60% sequence identity).
###Gene_Info_Comments GLEAN3_22897 ###
This GLEAN is a relative of TPX2, microtubule-associated protein, RefSeq: "gi|72167299|ref|XP_796944.1| PREDICTED: similar to TPX2, microtubule-associated protein homolog [Strongylocentrotus purpuratus]."
###Gene_Info_Comments GLEAN3_08221 ###
The last 238 amino acid residues of this GLEAN are related to TPX2, microtubule-associated protein, RefSeq: "gi|72167299|ref|XP_796944.1| PREDICTED: similar to TPX2, microtubule-associated protein homolog [Strongylocentrotus purpuratus]."
###Gene_Info_Comments GLEAN3_28717 ###
Glean3_28716 is the front part of GLEAN3_28717.  Reported as separate genes but should be the same gene.  I've attached the GLEAN3_28716 sequences in front of the GLEAN3_28717 sequences.  
###Gene_Info_Comments GLEAN3_21172 ###
The first two predicted exons likely belong to a different gene called calumenin.
###Gene_Info_Comments GLEAN3_10527 ###
sequence identical to GLEAN3_12840
###Gene_Info_Comments GLEAN3_04366 ###
Glean has falsely predicted the following exon: 3
###Gene_Info_Comments GLEAN3_07226 ###
eIF3e spans 2 glean predictions, 07226 (N-ter) and 09248 (C-ter), probably missing exons in between.
###Gene_Info_Comments GLEAN3_09248 ###
eIF3e spans 2 glean predictions, 07226 (N-ter) and 09248 (C-ter), probably missing exons in between.
###Gene_Info_Comments GLEAN3_25856 ###
The prediction is likely incorrect regarding the first 160 amino acids.
###Gene_Info_Comments GLEAN3_00502 ###
homology with UPF3 from aa 1200 to 1588 AAG48511.1, 
aa 1-1200 similar to Retrovirus-related POL polyprotein (Endonuclease) AAH66867.1
###Gene_Info_Comments GLEAN3_05736 ###
This gene was annotated with A. pectinifera mRNA and peptide.  The complete gene is annotated in Glean3_12078.
Aligns with ApIP3R AA sequence from 1670-1769.
###Gene_Info_Comments GLEAN3_27674 ###
Annotation of this gene was done with alignments using A. pectinifera mRNA and peptide.  The full annotation of this gene is on Glean3_12078.  Of note, there were two Glean3 predicted exons in sequential order which were exactly the same.  One was erased.  This glean aligns with ApIP3R AAs 2058-2673.
###Gene_Info_Comments GLEAN3_21143 ###
NCBI is calling this a DSP4, but the best non-predicted BLAST hit is a DSP1
###Gene_Info_Comments GLEAN3_02908 ###
59% identity with the corresponding region of human VARS2 (P26640) whereas it has only 46% identity with the human VARS2-like (NP_065175.3)

Sp-VARS isoformB has 47% identity with Sp-VARS isoformA (GLEAN3_08058)

The construction was made from
7 exons from glean3_28547 plus 1 exon from glean3_03860 to build the N-ter region
and 15 exons from glean3_02908 to build the C-ter region
###Gene_Info_Comments GLEAN3_28547 ###
fragments of Sp-VARS isoformB
complete gene annotated in GLEAN3_02908

55% identity with corresponding region in human VARS2 (P26640)
###Gene_Info_Comments GLEAN3_14014 ###
close similarity in the overlapping region with GLEAN3_01009
###Gene_Info_Comments GLEAN3_01009 ###
close similarity in the overlapping region with GLEAN3_14014
###Gene_Info_Comments GLEAN3_25860 ###
Overlaps with GLEAN3_17572.
###Gene_Info_Comments GLEAN3_17572 ###
Overlaps with GLEAN3_25860.
###Gene_Info_Comments GLEAN3_22195 ###
A very similar protein is predicted in GLEAN3_03052.
###Gene_Info_Comments GLEAN3_03052 ###
A very similar protein is predicted in GLEAN3_22195.
###Gene_Info_Comments GLEAN3_25517 ###
Cyclic ADP-ribose is an important calcium mobilizing metabolite produced by the ADP-ribosyl cyclase (cyclases) family of enzymes. Three evolutionarily conserved ADP-ribosyl cyclase superfamily members have been identified, one from the invertebrate Aplysia californica and two from mammalian tissues, CD38 and CD157.

This annotation bears most homology to CD157.
###Gene_Info_Comments GLEAN3_18717 ###
matches human amino acid sequence from position 52-396 with 82% identity, BLAST score 1e-167
###Gene_Info_Comments GLEAN3_03985 ###
best vertebrate hit NM_204934
###Gene_Info_Comments GLEAN3_02461 ###
Cyclic ADP-ribose is an important calcium mobilizing metabolite produced by the ADP-ribosyl cyclase (cyclases) family of enzymes. Three evolutionarily conserved ADP-ribosyl cyclase superfamily members have been identified, one from the invertebrate Aplysia californica and two from mammalian tissues, CD38 and CD157.

incomplete sequence that has most homology to the ADP-ribosyl cyclase family member from Aplysia californica 
###Gene_Info_Comments GLEAN3_07538 ###
Three evolutionarily conserved ADP-ribosyl cyclase superfamily members have been identified, one from the invertebrate Aplysia californica and two from mammalian tissues, CD38 and CD157.

This annotation bears greatest homology to family member CD38
###Gene_Info_Comments GLEAN3_10746 ###
This model was modified and annotated based on a manual inspection of multiple protein sequence alignments and its predicted domain architecture.

The 5' features of the corresponding Fgenesh model were chosen over those of the original glean model because they generate a model that better corresponds in domain structure with the genes to which this model Blasts back (B7-1/CD80). The 3' features of the glean/genscan and Fgenesh models are also different. Since they do not give rise to significant differences in domain structure or Blasting to genes from other phyla, we have decided to accept the former.
###Gene_Info_Comments GLEAN3_19025 ###
Appears to be an additional exon of GLEAN3_19024
###Gene_Info_Comments GLEAN3_19026 ###
Appears to be an additional exon of GLEAN3_19024.
###Gene_Info_Comments GLEAN3_11076 ###
This model was annotated based on a manual inspection of multiple protein sequence alignments and its predicted domain architecture.
###Gene_Info_Comments GLEAN3_20633 ###
vertebrate homolog DM. NM 057490
###Gene_Info_Comments GLEAN3_20159 ###
GLEAN3_20159 appears to be C-terminal portion of Chlamydomonas IFT140: AAT95430. eval=1.00E-145. explains aas 183-775 of IFT140.
GLEAN3_21918 appears to be N-terminal portion of IFT140. eval=2.00E-84. explains aas 677-1249 of IFT140.
Annotated by RL Morris.

###Gene_Info_Comments GLEAN3_21918 ###
GLEAN3_20159 appears to be C-terminal portion of Chlamydomonas IFT140: AAT95430. eval=1.00E-145. explains aas 183-775 of IFT140.
GLEAN3_21918 appears to be N-terminal portion of IFT140. eval=2.00E-84. explains aas 677-1249 of IFT140.
Annotated by RL Morris.
###Gene_Info_Comments GLEAN3_23605 ###
GLEAN3_23605 eval=0.0 against "FAP80, IFT122A, Intraflagellar Transport Protein 122A [Chlamydomonas reinhardtii]". 
GLEAN3_23605 explains aas 83-795 of Chlamydomonas FAP80/IFT122A which is 1162 aas long.
Annotated by RL Morris.
###Gene_Info_Comments GLEAN3_25443 ###
This is the 5' end of TIAM2.  The 3' end is GLEAN3_11527.
###Gene_Info_Comments GLEAN3_18291 ###
This glean contains a RhoGEF domain but also, unexpectedly, an F-box domain.  Either this is a novel GEF or the F-box is not part of this gene.
###Gene_Info_Comments GLEAN3_20692 ###
This is the 5' end of the gene.  The rest is located in GLEAN3_07086.
###Gene_Info_Comments GLEAN3_07086 ###
This is most of the gene.  However, the 5' end is GLEAN3_20692
###Gene_Info_Comments GLEAN3_04041 ###
This is a duplication of 020022, which looks like a splice variant
###Gene_Info_Comments GLEAN3_12009 ###
This is most of the gene for Dock180, the 5' end is GLEAN3_17939
###Gene_Info_Comments GLEAN3_17939 ###
This is the 5' part of Sp-Dock180.  The rest of the gene is GLEAN3_12009
###Gene_Info_Comments GLEAN3_27143 ###
This is the 5' end of SRGAP.  SRGAP is in three GLEANs, with the middle being 027715 and the bulk (3'end) being 022632
###Gene_Info_Comments GLEAN3_27715 ###
This is the middle part of SRGAP.  The 5' end is 027143, and the 3' end is 022632
###Gene_Info_Comments GLEAN3_26597 ###
This is the 5' end of FAM13A1.  The 3' end is GLEAN3_00258
###Gene_Info_Comments GLEAN3_00258 ###
This is the 3' end of FAM13A1.  The 5' end is GLEAN3_26597.
###Gene_Info_Comments GLEAN3_25352 ###
This is a duplication of most of GLEAN3_00258, which itself is the 3'end of FAM13A1.
###Gene_Info_Comments GLEAN3_12388 ###
This gene was annotated based on a manual inspection of domain architecture and multiple protein sequence alignments.

This model blasts back and shows a similar domain structure (signal peptide/2xIG-v/TM) to that of diverse vertebrate IGSF genes, many of which have a well documented immune function.
###Gene_Info_Comments GLEAN3_26664 ###
C-term part of the protein seems too long
###Gene_Info_Comments GLEAN3_27781 ###
Seems to be an artifactual duplication of GLEAN3_23813, or vice versa.
###Gene_Info_Comments GLEAN3_13624 ###
This model was annotated based on a manual inspection of its predicted domain architecture and its Blasting to known genes.

This model codes for what seems a fairly novel protein, for it shows very weak Blasting to known genes in GenBank. Its sequence does not show any high confidence domain architecture (as predicted by SMART and Pfam); however, one particularly interesting predicted domain architecture for this model is sp+3xIG[low scores]+TM+ITAM[low score].
###Gene_Info_Comments GLEAN3_28030 ###
The N-terminal part of the protein seems inaccurate. The correct part (the half of the peptidase S8 domain that is lacking here) is most likely in GLEAN3_28031.

The most C-terminal part also seems inaccurate.
###Gene_Info_Comments GLEAN3_28031 ###
Seems to be the N-terminal part of GLEAN3_28030, which is adjacent (it seems to contain the peptidase S8 domain half that GLEAN3_28030 lacks).
###Gene_Info_Comments scaffoldi3224 ###
sequence here is only the conserved homeobox domain.
###Gene_Info_Comments GLEAN3_17228 ###
This model was annotated based on a manual inspection of its predicted domain architecture and its similarity to known genes.

Its sequence and domain structure are similar to those of various vertebrate immune IGSF genes (e.g. TCR, CD276, CD4). Furthermore, if the sequence of the glean model is fused to exons called by other predictions, there is an even better Blasting to these genes. Note that neither this nor any of the other corresponding predictions includes a transmembrane domain, indicating that this might be a partial model.
###Gene_Info_Comments GLEAN3_20457 ###
This model was annotated based on a manual inspection of its domain architecture and alignment to known sequences.

This model Blasts back to human CD276/B7-H3, and it has a partially similar predicted domain structure (sp+V-set+C2-set+TM). The alternative predictions are very similar to the glean model and they do not provide any additional information on this gene.
###Gene_Info_Comments GLEAN3_24439 ###
This model was annotated based on a manual inspection of sequence alignments and predicted domain architectures.

This glean model codes for a protein that Blasts back and has a similar domain structure to that of vertebrate SIRPB2 and SIRPG, although it has a slightly longer cytoplasmic portion. This model is represented in two separate Fgenesh predictions, the first of which has a domain structure more similar to that of SIRPs. However, there is equivalent signal from the tiling array data for all the exons, which would suggest that they all correspond to the same gene; we have therefore accepted the glean model in its original form.
###Gene_Info_Comments GLEAN3_24787 ###
This model codes for what seems a novel domain architecture: 2xIGv+2xCCP/Sushi. Even though there are gaps in the sequence between the N-ter IG domains (thus making it possible that this is a "forced" [artifactual] model), the 2nd IG domain and both Sushi domains are encoded in one uninterrupted contig.

The IG and Sushi domains Blast separately to sequences in Genbank, which supports further the idea that, if real, this gene would have a novel domain architecture. Of note, the IG domains blast to various vertebrate IGSF genes relevant for immunity.


###Gene_Info_Comments GLEAN3_28300 ###
This model was modified to incorporate an extra 5' exon based on an otherwise identical Fgenesh prediction. The modified model incorporates a signal peptide into the predicted sequence, which better resembles the structure of the vertebrate B7 family genes to which this gene Blasts back. It codes for sp+V-set+C2-set[low]. The sequence is most likely incomplete (transmembrane domain missing), which is expected since this model locates to a scaffold end.
###Gene_Info_Comments GLEAN3_21592 ###
GLEAN3_21592 eval=1E-30 against "CPC1, Central Pair Complex 1 [Chlamydomonas reinhardtii]"
Annotated by RL Morris, B Rossetti, and A Shorette.
###Gene_Info_Comments GLEAN3_10897 ###
This is a duplication of the SH3 domain of GLEAN3_19856
###Gene_Info_Comments GLEAN3_25252 ###
This is a duplication of part of GLEAN3_00206, which itself appears to be a partial PIK3R1.
###Gene_Info_Comments GLEAN3_25017 ###
C-terminal C1q domain and N-terminal extension of about 400 amino acids but without collagen repeats.
###Gene_Info_Comments GLEAN3_24653 ###
C-terminal C1q domain and N-terminal extension of about 200 amino acids but without collagen repeats.
###Gene_Info_Comments GLEAN3_28732 ###
C-terminal C1q domain plus N-terminal extension of about 200 amino acids but without collagen repeats.
###Gene_Info_Comments GLEAN3_24033 ###
C-terminal C1q domain plus N-terminal extension of about 200 amino acids but without collagen repeats.
###Gene_Info_Comments GLEAN3_28510 ###
This model was modified based on a corresponding Genscan model whose domain structure better resembles the vertebrate B7 family of genes (to which this model blasts back). The added exons were indeed predicted by Glean3, but as part of the adjacent glean model (GLEAN3_28511).
###Gene_Info_Comments GLEAN3_02608 ###
This model is located in a small scaffold, and is most likely incomplete (missing a transmembrane domain).


###Gene_Info_Comments GLEAN3_16836 ###
N-terminal EMI domain followed by a large number of EGF and EGFLam repeats.  There are homologs in vertebrates but they are a bit shorter and have a TM domain.
###Gene_Info_Comments GLEAN3_12032 ###
LRRNT/EGF/FN3/TM - looks as if it is missing the N-terminus with the rest of an LRR unit - such structures exist in vertebrates
###Gene_Info_Comments GLEAN3_10102 ###
SERIES OF LRRtyp REPEATS FOLLOWED BY KR AND 4 FA58C
A novel domain architecture.
###Gene_Info_Comments GLEAN3_19694 ###
GPS/7TM2 - EXTRACELLULAR DOMAIN HAS NO DOMAINS PREDICTED
INTRACELLULAR DOMAIN HAS SEVERAL DOMAINS (SR/WSC/CCP) THAT LOOK SUSPICIOUS
###Gene_Info_Comments GLEAN3_28171 ###
previously cloned gene
###Gene_Info_Comments GLEAN3_16565 ###
previously cloned genomic DNA
###Gene_Info_Comments GLEAN3_17445 ###
This gene model has a CARD domain at the N-terminus and Blasts back to the human VISA/CARDIF/Ips-1/MAVS in the top 10 hits. However, it lacks the TM at the C-terminus.
###Gene_Info_Comments GLEAN3_13091 ###
centromere specific histone H3
###Gene_Info_Comments GLEAN3_10425 ###
This GLEAN appears to be an ortholog of the Chlamydomonas reinhardtii ODA-DC3 gene (C_240117|160952 ODA-DC3, Outer Dynein Arm Docking Complex 3, Mr 25,000).
###Gene_Info_Comments GLEAN3_14310 ###
This gene model clusters with the RIG-I family of helicases in a phylogenetic tree based on a multiple sequence alignment of the DEXDc domains.
This gene model has a DEATH domain instead of a CARD domain, located at the C-terminus instead of the N-terminus.
###Gene_Info_Comments GLEAN3_25885 ###
This gene model clusters with the RIG-I family of helicases in a phylogenetic tree based on a multiple sequence alignment of the DEXDc domains.
This gene model has an N-terminal DEATH domain instead of a CARD domain.
###Gene_Info_Comments GLEAN3_11866 ###
This gene model clusters with the RIG-I family of helicases in a phylogenetic tree based on a multiple sequence alignment of the DEXDc domains.
This gene's CARD domain is located at the C-terminus instead of the N-terminus.
###Gene_Info_Comments GLEAN3_14311 ###
This gene model clusters with the RIG-I family of helicases in a phylogenetic tree based on a multiple sequence alignment of the DEXDc domains.
This gene model's CARD domain overlaps the DEXDc domain by ~25 amino acids.
###Gene_Info_Comments GLEAN3_07126 ###
This gene model clusters with the RIG-I family of helicases in a phylogenetic tree based on a multiple sequence alignment of the DEXDc domains.
This gene model codes for an N-terminal DEATH domain instead of a CARD domain.
###Gene_Info_Comments GLEAN3_14119 ###
This gene model clusters with the RIG-I family of helicases in a phylogenetic tree based on a multiple sequence alignment of the DEXDc domains.
This gene combines with part of the sequence from SPU_014118 to make a complete gene model (Helicase_C domain is missing from SPU_014119).
This gene model codes for an N-terminal DEATH domain instead of a CARD domain.
###Gene_Info_Comments GLEAN3_10536 ###
This gene model clusters with the RIG-I family of helicases in a phylogenetic tree based on a multiple sequence alignment of the DEXDc domains.

###Gene_Info_Comments GLEAN3_19134 ###
 extra 400 amino acids on N-terminus
###Gene_Info_Comments GLEAN3_16113 ###
 missing some N-terminus residues
###Gene_Info_Comments GLEAN3_15823 ###
 partial, tiny fragment
###Gene_Info_Comments GLEAN3_03850 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_04180 ###
 partial, missing N-terminus, may join with Glean3_11497
###Gene_Info_Comments GLEAN3_11497 ###
 partial, missing C-terminus, may join with Glean3_04180
###Gene_Info_Comments GLEAN3_05315 ###
 partial, missing some N-terminus residues
###Gene_Info_Comments GLEAN3_13104 ###
 fragment
###Gene_Info_Comments GLEAN3_16142 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_16259 ###
 partial, fragment
###Gene_Info_Comments GLEAN3_10535 ###
 partial
###Gene_Info_Comments GLEAN3_13319 ###
 partial
###Gene_Info_Comments GLEAN3_26353 ###
 partial
###Gene_Info_Comments GLEAN3_05476 ###
This gene model clusters with the RIG-I family of helicases in a phylogenetic tree based on a multiple sequence alignment of the DEXDc domains.
This gene model is at the end of a scaffold and is likely incomplete; it is missing part of the DEXDc domains and the effector domain (DEATH or CARD) at the N-terminus.
###Gene_Info_Comments GLEAN3_19617 ###
This gene model clusters with the RIG-I family of helicases in a phylogenetic tree based on a multiple sequence alignment of the DEXDc domains.
This gene model is at the end of a scaffold and is likely incomplete; it is missing the Helicase_c domain.
###Gene_Info_Comments GLEAN3_16718 ###
This gene model clusters with the RIG-I family of helicases in a phylogenetic tree based on a multiple sequence alignment of the DEXDc domains.
This gene is at the end of a scaffold and is likely incomplete; it is missing the Helicase_c domain.
###Gene_Info_Comments GLEAN3_20020 ###
This gene model clusters with the RIG-I family of helicases in a phylogenetic tree based on a multiple sequence alignment of the DEXDc domains.
However, the domain structure is unusual for this type of protein: it has an ERCC4 domain at the C-terminus (after a long stretch of low complexity sequence, which contains a poor hit to a DEATH domain). Both the Genscan and Fgenesh models are identical to the Glean3 prediction.
###Gene_Info_Comments GLEAN3_00006 ###
This gene model clusters with the RIG-I family of helicases in a phylogenetic tree based on a multiple sequence alignment of the DEXDc domains.
This gene model does not appear to encode an "effector" domain (N-terminal CARD domain in vertebrates). It is possible that it is incomplete, although the Genscan and Fgenesh predictions are identical to the Glean3 model and there is another ORF in opposite orientation just upstream.
###Gene_Info_Comments GLEAN3_25888 ###
The gene model C-term is likely incorrect.
###Gene_Info_Comments GLEAN3_12192 ###
potential novel kinase family member. Blasts reasonably well to Fumarate Dehydratase
###Gene_Info_Comments GLEAN3_26236 ###
This prediction does not include the N-terminal sequences of the protein which are present in Glean3_26235.  26235 and 26236 should be combined.
###Gene_Info_Comments GLEAN3_07435 ###
The automated GLEAN prediction contained an exon near the 5'-end of this gene that threw the alignment of the sequence out of line with all other alpha tubulins, so it was deleted.
###Gene_Info_Comments GLEAN3_21670 ###
This GLEAN has disappeared from the curated GLEAN set in GenBoree (043006).
###Gene_Info_Comments GLEAN3_20749 ###
Annotated by BLAST only.
Best human hit is NP_699160
###Gene_Info_Comments GLEAN3_28158 ###
Annotation by BLAST only
partial gene missing N terminus and exon in the middle
###Gene_Info_Comments GLEAN3_13551 ###
Glean model had extra C terminal exon
May be missing small real C terminal
###Gene_Info_Comments GLEAN3_12976 ###
possibly missing 1-2 exons in middle and 1 exon at N terminus
trees with Ciona intestinalis predicted NATs
###Gene_Info_Comments GLEAN3_26720 ###
This model is likely incomplete (sits on a small scaffold) and is almost identical to GLEAN3_03486 annotated by Charlie Whittaker. For this reason we have followed Whittaker's annotation to annotate this model.
###Gene_Info_Comments GLEAN3_14392 ###
Partial sequence duplication or homolog to GLEAN3_01360.
###Gene_Info_Comments GLEAN3_01360 ###
May represent a duplication or paralog since GLEAN3_14392 also matches a significant part of the published katanin B sequence.
###Gene_Info_Comments GLEAN3_27562 ###
MORC protein sequence revealed putative nuclear localization signals, two predicted coiled-coil structural motifs and limited homology to GHL (GyraseB, Hsp90, MutL) ATPase. Epitope-tagged MORC protein expressed in COS7 cells localized to the nucleus
###Gene_Info_Comments GLEAN3_23500 ###
May be a partial gene.  The best CDS match to meiosis defective 1 spans only the first 450 amino acid residues, and matches the middle of the mei-1 protein.
###Gene_Info_Comments GLEAN3_28506 ###
Only one predicted exon is conserved:
>GLEAN3_28506|Scaffold82783|17550|17749| DNA_SRC: Scaffold82783 START: 17550 STOP: 17749 STRAND: + 
AAACGAGCAACGCATCCATCGATTACAGCAAAGAGAATGAAACCGCCACTCTTACATTCCCCTCACCCCT
TGCTGTCGGCAGCGGTGATCTGGCCCTGGAGTTCACAGGAGAGCTCAATGATAAGATGAAAGGGTTCTAT
CGAAGCAAGTACACCACACCAGCTGGTGAAGAAAGATACTGTGCTGTTACTCAGTTTGAG

###Gene_Info_Comments GLEAN3_21117 ###
Only two predicted exons are conserved:
>GLEAN3_21117|Scaffold22466|7279|7378| DNA_SRC: Scaffold22466 START: 7279 STOP: 7378 STRAND: + 
CTCAGGCAGAGGTAGACATGAAGGCCTGGTTTTGCCGTAGAGGGCGAAGAAGAAATCCACCACATCACAA
CGTAACCTGGCATCGTGAGATGTACCTGAA
>GLEAN3_21117|Scaffold22466|7944|8065| DNA_SRC: Scaffold22466 START: 7944 STOP: 8065 STRAND: + 
TTCATGTGGGCCCAGAGCCTCTCTACTAGTTCCTCTGTGCTAAGCTTCGACTGTTCCTTGTACTTAAACG
GTGGGTTGGCTATCAGCAGCCGTAGTAGCTTGTGCCTGGGTTGGAGATACAT

###Gene_Info_Comments GLEAN3_04070 ###
fragment
###Gene_Info_Comments GLEAN3_26235 ###
fragment
###Gene_Info_Comments GLEAN3_12022 ###
The following exons are the only ones conserved with other CPA proteins.
>GLEAN3_12020|Scaffold48388|53670|53789| DNA_SRC: Scaffold48388 START: 53670 STOP: 53789 STRAND: + 
CTTGATTTCTGGAAGCGCCCGTCGAAGGTTGGACGGCCCGTCGACGTGATGGTCTCCCCCGCCCAGCAGT
TGAGCTTCGTTAGCTCCGCGAGCCGCCCTGGACTCTTTATCGAGACTTGG
>GLEAN3_12020|Scaffold48388|55561|55662| DNA_SRC: Scaffold48388 START: 55561 STOP: 55662 STRAND: + 
ATCGCGAAGTCACCATCTGCAACGAACGTGGCTTACATCCAGGGAGGCATCCACGCCCGCGAATGGGTCA
GCCCAGCTACAGTCATCAACCTCATCAAAAAT
>GLEAN3_12020|Scaffold48388|56376|56483| DNA_SRC: Scaffold48388 START: 56376 STOP: 56483 STRAND: + 
TACATAGATAACTACGGCAGTGATGATACGGTGACGAGCATGTTGGATAACTTTGTGTGGATCATTGTAC
CCGTCTACAACATCGATGGATACAAGTTCAGCCACACC
>GLEAN3_12020|Scaffold48388|58257|58347| DNA_SRC: Scaffold48388 START: 58257 STOP: 58347 STRAND: + 
GACGATCGTATGTGGCGCAAGAATCGCAACCCCAACGTAGGAGGCTGCGCTGGAGTCGATCTGAACCGCA
ACTATGACTTCGAGTGGGGAG
>GLEAN3_12020|Scaffold48388|61004|61206| DNA_SRC: Scaffold48388 START: 61004 STOP: 61206 STRAND: + 
GTGCCAGCAAACAGAGGTGTACCCAGGATTATCAGGGCACAGAGCCGCTGAGTGAACCCGAGAACAGCGG
CTCCAAGGCTTTCCTGCAAGGCTTTGGTTCAAACCTCAAACTCTTCATTGATTTCCACGCCTATGGCCAG
TACTGGCTCTACCCATGGGGTTACACCAGGAGAACCCTTGCACAACCAGATAGAGACGATCAG
>GLEAN3_12020|Scaffold48388|62567|62757| DNA_SRC: Scaffold48388 START: 62567 STOP: 62757 STRAND: + 
ACCCCGCAACAGGTGCAAGCGAAGACTTTGGATACGGCTCCCTGGGTGTGAAGTACACCTACGTGGTGGA
GCTGAGGGATGAGGGCACTTTCGGGTTCTCGCTCCCCGCCTACCAGATCCAGCCCACCGGTGAGGAGATC
TTCGCCGGTATGAAGACACTCGGCAAGCAGCTCGTTGCCGAGTATGCTTAG
		
###Gene_Info_Comments GLEAN3_27565 ###
Comparison to best blast hit suggests that the prediction may be missing N-terminal sequence.
###Gene_Info_Comments GLEAN3_03774 ###
small C-type lectin
###Gene_Info_Comments GLEAN3_06011 ###
potential novel kinase family member
###Gene_Info_Comments GLEAN3_10455 ###
potential novel kinase family member
###Gene_Info_Comments GLEAN3_11808 ###
potential novel kinase family member
###Gene_Info_Comments GLEAN3_12859 ###
potential novel kinase family member
###Gene_Info_Comments GLEAN3_13806 ###
potential novel kinase family member
###Gene_Info_Comments GLEAN3_16864 ###
potential novel kinase family member
###Gene_Info_Comments GLEAN3_17078 ###
potential novel kinase family member
###Gene_Info_Comments GLEAN3_18352 ###
potential novel kinase family member
###Gene_Info_Comments GLEAN3_20735 ###
potential novel kinase family member
###Gene_Info_Comments GLEAN3_22611 ###
potential novel kinase family member
###Gene_Info_Comments GLEAN3_28859 ###
potential novel kinase family member
###Gene_Info_Comments GLEAN3_28625 ###
probable TK family member

###Gene_Info_Comments GLEAN3_02076 ###
probable TKL family member
###Gene_Info_Comments GLEAN3_06622 ###
probable TKL family member

###Gene_Info_Comments GLEAN3_08015 ###
probable TKL family member

###Gene_Info_Comments GLEAN3_02493 ###
potential novel kinase family member
###Gene_Info_Comments GLEAN3_05457 ###
potential novel kinase family member
###Gene_Info_Comments GLEAN3_22612 ###
potential novel kinase family member
###Gene_Info_Comments GLEAN3_12527 ###
Best alignment in the central region, where coiled coil domain is likely encoded.
###Gene_Info_Comments GLEAN3_21921 ###
partial sequence; based on NCBI predicted data set
###Gene_Info_Comments GLEAN3_09093 ###
very similar to 0014185
###Gene_Info_Comments GLEAN3_08923 ###
distinct from 07733, 25829
###Gene_Info_Comments GLEAN3_25421 ###
Eval 8e-12 against cd00159, RhoGAP, GTPase-activator protein for Rho-like GTPases...	
181 aas are 100% identical to NCBI prediction XP_798334 "PREDICTED: similar to Kinesin-like motor protein C6orf102, partial [Strongylocentrotus purpuratus]"
Annotated by RL Morris

###Gene_Info_Comments GLEAN3_15483 ###
polymeric globin gene similar to shrimp polymeric globin
###Gene_Info_Comments GLEAN3_08766 ###
GLEAN3_08766 is exactly identical to the C-terminal region of XP_789874.  Evalue = 1e-88 for XP_789874 against "NP_055690.1|  kinesin family member 14 [Homo sapiens]".
Annotated by RL Morris
###Gene_Info_Comments GLEAN3_01284 ###
Eval=2e-113 using GLEAN3_01284 against "NP_055889.2|  kinesin family member 1B isoform b [Homo sapiens]"
Annotated by RL Morris
###Gene_Info_Comments GLEAN3_09400 ###
AAF04841 = Sp-kinesin-C cloned sequence.  
Eval against AAF04841 = 0.
Annotated by RL Morris.
###Gene_Info_Comments GLEAN3_15484 ###
Homologous to DNA-PKcs, 5'end of gene is most likely missing. This model shows similarity to the N-ter sequence of DNA-PKcs. Glean3_15529 contains the 3'end / carboxy-terminus of the gene.
###Gene_Info_Comments GLEAN3_11294 ###
Middle third of this gene is SpILKAP b.
###Gene_Info_Comments GLEAN3_04669 ###
Exon 1 not confirmed by cDNA;
5' end at edge of 16 kb contig;	
only gene on contig

###Gene_Info_Comments GLEAN3_20182 ###
Allele of GLEAN3_0466, 3' end;
one exon in 13 kb contig
###Gene_Info_Comments GLEAN3_13544 ###
Gene model may be missing N-terminal 70-80 aa based on alignment to mammalian homologs.
###Gene_Info_Comments GLEAN3_27014 ###
Blasts to PTPRT, but forms part of a novel clade in phylogenetic analysis with PTPRFn1 and PTPRLec genes.  Partial sequence. See also GLEAN3_08466, GLEAN3_23162, and GLEAN3_24820.
###Gene_Info_Comments GLEAN3_20604 ###
Blasts to PTPRA, but in phylogenetic analysis it forms part of a novel clade with PTPRLec1, PTPRLec2, PTPRLec4, PTPRLec5, PTPRLec6, PTPRFn1, and PTPRFn2.
###Gene_Info_Comments GLEAN3_23115 ###
This protein has structure typical of a PTPR, but does not clade with known PTPRs.  It forms a unique clade and has been renamed PTPRiz.
###Gene_Info_Comments GLEAN3_28575 ###
Blasts to Receptor-type tyrosine-protein phosphatase mu precursor, but does not clade with the PTPR K/M/T/U group in phylogenetic analysis.  Renamed PTPRY3.  Clades with GLEAN3_15923 and GLEAN3_20542 (PTPRY2).
###Gene_Info_Comments GLEAN3_19920 ###
Partial sequence.
###Gene_Info_Comments GLEAN3_00652 ###
See putative conserved domains at:
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi?RID=1137535304-18825-160717183845.BLASTQ4
###Gene_Info_Comments GLEAN3_22669 ###
Blasts to MTMR6, homologous to myotubularin related proteins 6, 7, and 8 in phylogenetic analysis.
###Gene_Info_Comments GLEAN3_11860 ###
This blasts to PPEF1, but phylogenetic analysis showed that it was more similar to PPEF2.  Glean3_08844 is likely the identical protein.
###Gene_Info_Comments GLEAN3_24691 ###
GLEAN3_05570 is a partial duplicate prediction.
###Gene_Info_Comments GLEAN3_05570 ###
Partial duplicate prediction for GLEAN3_24691.
###Gene_Info_Comments GLEAN3_14812 ###
Gene sequence modified as per HMM model
###Gene_Info_Comments GLEAN3_25188 ###
sequence modified as per HMM model
###Gene_Info_Comments HIPKa ###
HMM prediction

###Gene_Info_Comments GLEAN3_04962 ###
protein sequence modified as per HMM model; this is consistent with tiling data
###Gene_Info_Comments GLEAN3_15281 ###
protein sequence modified as per HMM model
###Gene_Info_Comments GLEAN3_26828 ###
Protein sequence modified as per HMM prediction
###Gene_Info_Comments GLEAN3_07333 ###
Protein sequence modified as per HMM model
###Gene_Info_Comments Nek1c ###
Prediction as per HMM model
###Gene_Info_Comments GLEAN3_14818 ###
Protein sequence modified as per HMM prediction
###Gene_Info_Comments GLEAN3_16754 ###
Prediction is probably wrong: way too long. Predicted HMM model sequence is:

LTQAFGRSTAVLSMIGYIIGLGDRHLDNVLVNFVTGEVVHIDYNVCFEKG
KNLRVPERVPFRMTQNVQAALGITGVE

###Gene_Info_Comments SFK1b ###
protein sequence (incomplete) as per HMM prediction. 
###Gene_Info_Comments GLEAN3_24930 ###
see comments for glean 24721. This is 3' part of gene. Predicted protein sequence is
MAQAYVEDTLMVRHELKHAYGKVFLVRKVGNHNQGKLYAMKVLKKATIVQ
KAKTAEHTMTERQVLEAVRSCPFLVTLHYAFQTDSKLNLILDYVNGGELF
THLYQREHFRESEVRIYIAEIIIALDCLHRILTSHPPMPNTFSKEVKDFI
NKLLVKDPTKRLGCNGVKDIKSHSFFKGLNWDDVAAKRVSPPFRPHINGE
LDTSNFAEEFTSLVPADSPADIPKTADARVFRVGYSFIAPSILYSDNAIT
QDMLTQPSEHNRPSLASILSIHELKDSPFNKYYELDMKSAPIGDGSFSIC
RRCTHRKTEKEYAVKIVSRRVACTQEITTLQLCQKHPNIVHLKEEFKDKL
HTYIIMELCKGGELLGRIRKKKHFDELEASMIMRKLVSAVDYMHSRGIVH
RDLKPENILFTDDSDDAELKIIDFGFARITNSNQPLKTPCFSLHFAAPEV
LKRAYEQDGEYDASCDVWSLGVILYTMLSGRVPFQDPSISKSNSASDIMK
RIKHGNFSFDGEEWNSVSTPAKDLIKGLLTVDPSRRLTTDDLLQNEWIQG
QQLSTSTPLMTPDILNSCASIQKRVKATMRAFHTAQREGFLLTDVSNAPL
AKRRKKKKDSSTETRSSSSESTHSQSSSSQESTTPTPTANPVLTIPVTTV
SCAPRTTTATGAPSIPSVQPLPSLSKQTGARLDQYESLESLGFSPILPFS
AGGSQELPPLLARQDSGYVGQMPSYAQVTPVPRTNVGSHGVTYAPILDPS
MYPCGLQQPILDFSSSIPEYLSVQYASTEQPSIPMTVPRTLHQPHPHPLP
LPHQHLSHLPTISEDPSTT

###Gene_Info_Comments GLEAN3_25882 ###
This is likely to be the N terminus of 14404 (DAPK). The fused protein sequence is pasted below, but is probably still missing a few exons based on alignment to human DAPK1 (NP_004929).

MAMFRTESVEEFYQIGEDIGSGQFSEVKKVTEKSTGKDYAGKFIRKRRST
ASRRGVKREDIVREVSILEELSHDNIISLHDAFELQKEVVLILELVTGGE
LFHYLAEEDHVNEEVAAQFVKKILEALKHMHDRNICHLDLKPENIMLLNR
NTQNIMLIDFGLSRRIKPGEDIRDIMGTAEFVAPEIINFEPLSLNTDMWP
VFISSRLPNIQLIQLSRSQVIQLVCVFISPKMGHRREQLALRKMSKALRS
DWEHGETALHLAAGYGHVDILEYLQAKGASIDVADKTGETPLHVAGRYGQ
VEAVQYLCDQAVNSNLADEDGETPLHIAAWHGYTSIVQTLCKAGATLDLK
NKDGETTLLCAAARGHLDIVKILVEAGALLNTIDKHGITPLHHAVRRQHY
DIVKYLVDSNCDVNLQDKLGDTPLNVACKEGALDLVEMLHAVGAKRDILN
RHKNSALHMAARGGHIEVVRYLCLAGALIHQRNQDGLTASQLASLEGHED
VADVLTQVEGDKSKDLFINQLNSTSGPLHRIRIKVLGQSGVGKTALIDSL
KCGYFRGLFRRSRSNISLIGSSSNGRSSPRSPRSPRSPLTPMFGNGKKMD
GGRFFMESLKRKQLSSTSSSFDVDSEVTRGIEFTHGTIPGAGDFTFLEFS
GEDTYHTAYPHFLSDEGAIHLVVFSLDDMFEEQLAQVTYWMNFLRSQLPA
TEPVGYCGKYRQQPKIALVATHADHTQCPKQPTGELISGEGNIVLYQTKR
LFGRLFDICDVLFVMDANSAQSKDVKMLRTHISSLRNSILKDKSKDLFIN
QLNSTSGPLHRIRIKVLGQSGVGKTALIDSLKCGYFRGLFRRSRSNISLI
GSSSNGRSSPRSPRSPRSPLTPMFGNGKKMDGGRFFMESLKRKQLSSTSS
SFDADSEVTRGIEFTHGTIPGAGDFTFLEFSGEDTYHTAYPHFLSDEGAI
HLVVFSLDDMFEEQLAQVTYWMNFLRSQLPATEPVGYCGKYRQQPKIALV
ATHADHTQCPKQPTGEMISGEGNIVLYQTKRLFGRLFDICDVLFVMDANS
AQSKDVKMLRTHISSLRNSILKVEAPVSVLCEAVASALPAWRRTFVNFPV
LTWQQFSEGIHASINPLAGQAHLREVGRQLHLMGEVQCFGSELLQEVIVI
EPTWLCSGIIGRLLSHDATEQPEGQYSIHYIQSLFPDTDAMDISQLMEAM
DICVHGTVCEIPAVMRCPAPEGIWEKEDENGNFRVYGGVRMQLSDCGSTL
PSGLFSRIQMSLRRNFQQDMEDTTDNELVMWRNGAKCSSGSIEGLISMTN
DECAIEIKVRGYNDTRQGCFIFLEDLVHLVKHVLVDSYPGLPLNMEVLSP
IQLSSHEKTIMVYNACSLLRLQLRTERTVENPISNQEEDFVDIFCFGSES
VESNLIAGVDLHLSEIPSLTRRQISMLLDPPDPMGKDWCLLAVGLGLTEK
IPMLDTLNRRCGPDESDSPTERLLQEWGKEETNSVGVLLNKVKDLGREDV
LRVLMQGSPLYKFVPDPRALEEGRQSGSGSNHSSGTVASR

###Gene_Info_Comments GLEAN3_14118 ###
This gene combines with part of the sequence from SPU_014119 to make a complete gene model (Helicase_C domain is missing from SPU_014119).
See other gene model for further details.
###Gene_Info_Comments GLEAN3_11306 ###
in progress
###Gene_Info_Comments GLEAN3_26949 ###
more similar to vertebrate transferrins than invertebrate
appears to be novel form
###Gene_Info_Comments Sp-Mafs ###
Likely missing first exon (non-coding) and second exon encoding N-terminal sequence.
First reported by Coolen, et al. (2005). Phylogenomic analysis and expression patterns of large Maf genes in Xenopus tropicalis provide new insights into the functional evolution of the gene family in osteichthyans. Dev Genes Evol 215, 327-39.
###Gene_Info_Comments GLEAN3_09476 ###
missing exon in middle
###Gene_Info_Comments GLEAN3_08954 ###
There seem to be two MAP1A/1B_LC3-like proteins encoded in the S. purpuratus genome: GLEAN3_09444 (~72% sequence identity) and GLEAN3_08954 (~60% sequence identity).
###Gene_Info_Comments GLEAN3_04008 ###
Contains FHA domain and Reverse transcriptase domain
Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP (kinase-associated protein phosphatase), and Ki-67 (a human nuclear protein related to cell proliferation).
###Gene_Info_Comments GLEAN3_01270 ###
 only N-terminus fragment
###Gene_Info_Comments GLEAN3_01271 ###
 only N-terminus fragment
###Gene_Info_Comments GLEAN3_03343 ###
 fragment
###Gene_Info_Comments GLEAN3_10265 ###
 partial, missing C-terminus half, should join with Glean3_10266
###Gene_Info_Comments GLEAN3_10266 ###
 fragment, should join with Glean3_10265
###Gene_Info_Comments GLEAN3_10602 ###
 fragment
###Gene_Info_Comments GLEAN3_10738 ###
 fragment
###Gene_Info_Comments GLEAN3_13126 ###
 fragment
###Gene_Info_Comments GLEAN3_13531 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_14927 ###
 fragment
###Gene_Info_Comments GLEAN3_15609 ###
 fragment
###Gene_Info_Comments GLEAN3_16267 ###
 fragment
###Gene_Info_Comments GLEAN3_17253 ###
 fragment
###Gene_Info_Comments GLEAN3_18825 ###
 partial
###Gene_Info_Comments GLEAN3_19381 ###
 fragment
###Gene_Info_Comments GLEAN3_19849 ###
 partial, missing N- and C-terminus
###Gene_Info_Comments GLEAN3_20358 ###
 partial, missing N-terminus
###Gene_Info_Comments GLEAN3_22124 ###
 fragment, missing N-terminus region
###Gene_Info_Comments GLEAN3_01506 ###
 fragment
###Gene_Info_Comments GLEAN3_02449 ###
 fragment
###Gene_Info_Comments GLEAN3_03043 ###
 fragment
###Gene_Info_Comments GLEAN3_08512 ###
Alignment with best blast sequence suggests that the model may lack N- and C-terminal sequences.

There are a large number of nearly identical sequences on many  scaffolds that are not on the glean3 list.  There are also multiple copies on the glean3 list.
###Gene_Info_Comments GLEAN3_08512 ###
 fragment
###Gene_Info_Comments GLEAN3_10028 ###
 fragment
###Gene_Info_Comments GLEAN3_10065 ###
 fragment
###Gene_Info_Comments GLEAN3_10066 ###
 fragment
###Gene_Info_Comments GLEAN3_10275 ###
This is the N-terminal region of PLCg.  The full annotation can be found on scaffold 53431 and GLEAN3_27462

###Gene_Info_Comments GLEAN3_10275 ###
 fragment, extra stretch in middle
###Gene_Info_Comments GLEAN3_01514 ###
 fragment, extra mismatched stretch on N-terminus
###Gene_Info_Comments GLEAN3_03613 ###
 partial
###Gene_Info_Comments GLEAN3_04955 ###
 partial
###Gene_Info_Comments GLEAN3_06198 ###
 extra mismatched long stretch on C-terminus
###Gene_Info_Comments GLEAN3_06971 ###
 fragment
###Gene_Info_Comments GLEAN3_08301 ###
 fragment
###Gene_Info_Comments GLEAN3_08303 ###
 partial
###Gene_Info_Comments GLEAN3_08612 ###
 fragment
###Gene_Info_Comments GLEAN3_09147 ###
 fragment, unmatched stretch on N-terminus
###Gene_Info_Comments GLEAN3_06296 ###
Hypothetical protein
similarities with solute carrier
ATP binding cassette
lactotransferrin
###Gene_Info_Comments GLEAN3_01715 ###
Hypothetical protein with no homologs in other species
###Gene_Info_Comments GLEAN3_05307 ###
probable ortholog of human Zinc-finger 318
the naming is different from the Stefan Materna naming 
because of some uncertainties in the homology 
###Gene_Info_Comments GLEAN3_01945 ###
Hypothetical protein with BLAST hits with various prots with similarities with Kazrin (among others)
No clear orthology
 
Contains SMC and 3 SAM domains 
SMC=nucleotide binding cassette
SAM=Sterile alpha motif.; Widespread domain in signalling and nuclear proteins. In EPH-related tyrosine kinases, appears to mediate cell-cell initiated signal transduction via the binding of SH2-containing proteins to a conserved tyrosine that is phosphorylated. In many cases mediates homodimerization.

###Gene_Info_Comments GLEAN3_21165 ###
PREDICTED: similar to Laminin-like protein K08C7.3 precursor 

###Gene_Info_Comments GLEAN3_06349 ###
transposon
###Gene_Info_Comments GLEAN3_22598 ###
PREDICTED: similar to flavin containing monooxygenase 5

Contains a long Fibrinogen-related domain (FReD); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation.

###Gene_Info_Comments GLEAN3_05714 ###
Hypothetical prot similar to CTD binding prot

Contains a RING and a PHD domain

RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)

PHD-finger. PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains.
###Gene_Info_Comments GLEAN3_22038 ###
contains two Cap-ED domains 
PREDICTED: similar to cAMP-dependent protein kinase type I-alpha  regulatory subunit
###Gene_Info_Comments GLEAN3_07251 ###
PREDICTED: similar to sequestosome 1 isoform 1

Contains PB1 domain and ZZ domain
PB1 domain ; Phox and Bem1p domain, present in many eukaryotic cytoplasmic signalling proteins. The domain adopts a beta-grasp fold, similar to that found in ubiquitin and Ras-binding domains. A motif, variously termed OPR, PC and AID, represents the most conserved region of the majority of PB1 domains, and is necessary for PB1 domain function. This function is the formation of PB1 domain heterodimers, although not all PB1 domain pairs associate

Zinc finger, ZZ type. Zinc finger present in Drosophila ref(2)P, NBR1, Human sequestosome 1 and related proteins. The ZZ motif coordinates two zinc ions and most likely participates in ligand binding or molecular scaffolding. Drosophila ref(2)P appears to control the multiplication of sigma rhabdovirus. NBR1 (Next to BRCA1 gene 1 protein) interacts with fasciculation and elongation protein zeta-1 (FEZ1) and calcium and integrin binding protein (CIB), and may function in cell signalling pathways. Sequestosome 1 is a phosphotyrosine independent ligand for the Lck SH2 domain and binds noncovalently to ubiquitin via its UBA domain.
###Gene_Info_Comments GLEAN3_10285 ###
PREDICTED: similar to ubiquitously transcribed tetratricopeptide  repeat gene, X chromosome

Contains 1 TPR domain and 1 jmjC domain

Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here

jmjC domain. The jmjC domain is thought to be involved in chromatin organisation by modulating heterochromatisation.

###Gene_Info_Comments GLEAN3_16785 ###
Homology is not very strong (7e-11)
Best genbank hit is from Danio rerio 
name is temporary waiting for a better one
###Gene_Info_Comments GLEAN3_16786 ###
PREDICTED: similar to Retinal homeobox protein Rx3 
###Gene_Info_Comments GLEAN3_16787 ###
PREDICTED: similar to Alpha-1B adrenergic receptor
Contains a 7 transmembrane receptor (rhodopsin family)(8e-18)
###Gene_Info_Comments GLEAN3_19627 ###
Predicted: putative protein which
contains a Putative homoserine kinase type II (protein kinase fold) [General function prediction only]
the level of homology is weak 
###Gene_Info_Comments GLEAN3_23530 ###
predicted protein with similarities to At rich interactive domain Swi1-like
###Gene_Info_Comments GLEAN3_24578 ###
PREDICTED: similar to zn-finger, CCHC type and RNA-directed DNA polymerase  and Integrase, catalytic domain containing protein 
family member (1E419)

Hypothetical prot with reverse transcriptase and integrase domains
###Gene_Info_Comments GLEAN3_15699 ###
This gene encodes a signal peptide and a C1q domain.
###Gene_Info_Comments GLEAN3_15700 ###
this gene encodes a signal peptide and a C1q domain
###Gene_Info_Comments GLEAN3_24010 ###
this may be a gene fragment
###Gene_Info_Comments GLEAN3_25635 ###
GLEAN3_03194 may represent the latter half of this gene. 
###Gene_Info_Comments GLEAN3_12648 ###
The prediction appears to be a combination of two genes, one encoding a bzip transcription factor and the other a phosphatidylinositol glycan.  Only GLEAN3_12648|Scaffold85774|13949|15052| is conserved with the bzip transcription factor gene, zhangfei.
###Gene_Info_Comments GLEAN3_23034 ###
GLEAN3_14930 is a partial duplicate prediction.
###Gene_Info_Comments GLEAN3_14930 ###
Partial duplicate prediction of GLEAN3_23034
###Gene_Info_Comments GLEAN3_11973 ###
GLEAN3_11973 has part I. GLEAN3_01984 has the middle part. Last part may be missing.
###Gene_Info_Comments GLEAN3_01984 ###
GLEAN3_11973 has part I. GLEAN3_01984 has the middle part. Last part may be missing.
###Gene_Info_Comments GLEAN3_00127 ###
GLEAN3_09993 is a duplicate prediction.
###Gene_Info_Comments GLEAN3_09993 ###
GLEAN3_09993 is a duplicate prediction for GLEAN3_00127.
###Gene_Info_Comments GLEAN3_00893 ###
GLEAN3_00893 encodes the middle part of USP34. GLEAN3_13835 encodes the last part. GLEAN3_13835 and GLEAN3_08562 are overlapping duplicate predictions. The first part of the gene homologous to human USP34 is missing. GLEAN3_28691 (homologous with ~aa 12-312 of human USP34) and GLEAN3_05174 (homologous with ~aa 698-910 of the human protein) are only partially similar to human USP34.
###Gene_Info_Comments GLEAN3_02882 ###
GLEAN3_02882 is a partially incorrect prediction for USP46. It appears to have an extra exon at the beginning and ~300 aa at the end are non-homologous.
###Gene_Info_Comments GLEAN3_03944 ###
GLEAN3_03944 has part I, GLEAN3_03945 has part II and GLEAN3_23070 has part III of the USP32 gene. In addition, GLEAN3_03945 and GLEAN3_23070 share an overlap of ~100 AA.
###Gene_Info_Comments GLEAN3_03945 ###
GLEAN3_03944 has part I, GLEAN3_03945 has part II and GLEAN3_23070 has part III of the USP32 gene. In addition, GLEAN3_03945 and GLEAN3_23070 share an overlap of ~100 AA.
###Gene_Info_Comments GLEAN3_23070 ###
GLEAN3_03944 has part I, GLEAN3_03945 has part II and GLEAN3_23070 has part III of the USP32 gene. In addition, GLEAN3_03945 and GLEAN3_23070 share an overlap of ~100 AA.
###Gene_Info_Comments GLEAN3_08736 ###
Human USP48 gene isoforms are very different in size. Isoform a is 1035 aa long whereas isoform b is 485 aa long. GLEAN3_08736 is homologous to isoform b but, along with GLEAN3_01900, may also be considered homologous to human USP48 isoform a. There are ~170 aa missing from isoform a in urchin, if GLEAN3_08736 and GLEAN3_01900 are coding for USP48 isoform a.
###Gene_Info_Comments GLEAN3_01900 ###
Human USP48 gene isoforms are very different in size. Isoform a is 1035 aa long whereas isoform b is 485 aa long. GLEAN3_08736 is homologous to isoform b but, along with GLEAN3_01900, may also be considered homologous to human USP48 isoform a. There are ~170 aa missing from isoform a in urchin, if GLEAN3_08736 and GLEAN3_01900 are coding for USP48 isoform a.
###Gene_Info_Comments GLEAN3_09955 ###
GLEAN3_21779 is a partial duplicate prediction for GLEAN3_09955.
###Gene_Info_Comments GLEAN3_06393 ###
GLEAN3_06393 encodes the first half of the gene. GLEAN3_11524 has the rest.
###Gene_Info_Comments GLEAN3_11524 ###
GLEAN3_06393 encodes the first half of the gene. GLEAN3_11524 has the rest.
###Gene_Info_Comments GLEAN3_11940 ###
Appears to be missing ~200 aa at the beginning of USP22.
###Gene_Info_Comments GLEAN3_21779 ###
GLEAN3_21779 is a partial duplicate prediction for GLEAN3_09955.
###Gene_Info_Comments GLEAN3_28265 ###
First ~400 aa (as compared to human USP10) are missing from GLEAN3 predictions.
###Gene_Info_Comments GLEAN3_13119 ###
this is the same as GLEAN3_02448.
###Gene_Info_Comments GLEAN3_15421 ###
Also similar to GLEAN3_22485.
###Gene_Info_Comments Sp-Il17-9 ###
This gene was annotated based on FgeneshAB and Fgenesh++.

###Gene_Info_Comments Sp-Il17-10 ###
This gene was annotated based on FgeneshAB and ++.
###Gene_Info_Comments Sp-Il17-11 ###
This gene was annotated based on FgeneshAB and ++.

###Gene_Info_Comments Sp-Il17-12 ###
This gene was annotated based on FgeneshAB and ++. It may be partial because unknown sequence is located behind the first exon.
###Gene_Info_Comments Sp-Il17-13 ###
This gene model was annotated based on FgeneshAB and ++.

###Gene_Info_Comments Sp-Il17-p1 ###
This gene model was annotated based on FgeneshAB and ++. It is partial and may link to Sp-Il17-12.
###Gene_Info_Comments Sp-Il17-14 ###
This gene model was annotated based on FgeneshAB and ++.

###Gene_Info_Comments Sp-Il17-15 ###
This gene model was annotated based on FgeneshAB and ++.

###Gene_Info_Comments Sp-Il17-16 ###
This gene model was annotated based on FgeneshAB and ++.

###Gene_Info_Comments Sp-Il17-17 ###
This gene model was annotated based on FgeneshAB and ++.

###Gene_Info_Comments Sp-Il17-p2 ###
This partial gene model was annotated based on FgeneshAB. The model is  located in a short scaffold.

###Gene_Info_Comments Sp-Il17-p3 ###
This partial gene model was annotated based on FgeneshAB, ++ and BlastN.  The model is located at the end of a contig. 
###Gene_Info_Comments Sp-Il17-18 ###
This gene was annotated based on FgeneshAB and ++.

###Gene_Info_Comments Sp-Il17-19 ###
This gene was annotated based on FgeneshAB and ++.

###Gene_Info_Comments Sp-Il17-20 ###
This gene model was annotated based on FgeneshAB and ++.

###Gene_Info_Comments Sp-Il17-21 ###
This gene model was annotated based on FgeneshAB and ++. The model is partial and located at the end of a scaffold.

###Gene_Info_Comments Sp-Il17-22 ###
This gene model was annotated based on a part of NCBI and FgeneshAB prediction. The model is partial and located at the end of a contig.


###Gene_Info_Comments Sp-Il17-p4 ###
This gene model was annotated based on FgeneshAB. The model was partial in a short scaffold, so the Il17 domain was not found. But the nucleotide sequence was 93% similar to that of another Sp-Il17.
###Gene_Info_Comments Sp-Il17-p5 ###
This gene model was annotated based on FgeneshAB and ++. The model was located at the end of a scaffold and Il17 domain was not found. But the nucleotide sequence was 93% similar to another Sp-Il17.

###Gene_Info_Comments Sp-Il17-23 ###
This gene model was annotated based on FgeneshAB and ++. It has a partial Il17 domain.
###Gene_Info_Comments Sp-Il17-24 ###
This gene model was annotated based on FgeneshAB and ++.

###Gene_Info_Comments GLEAN3_24199 ###
May have an extra exon at the end.
###Gene_Info_Comments GLEAN3_14962 ###
It is possible that GLEAN3_00634 may code for the first part of this protein.
###Gene_Info_Comments GLEAN3_17602 ###
GLEAN3_16877 may represent the first half of this gene.
###Gene_Info_Comments GLEAN3_17871 ###
Duplicate prediction for GLEAN3_07962.
###Gene_Info_Comments GLEAN3_16537 ###
GLEAN3_01217 is a longer duplicate prediction.
###Gene_Info_Comments GLEAN3_01217 ###
GLEAN3_01217 is a longer duplicate prediction for GLEAN3_16537.
###Gene_Info_Comments GLEAN3_00228 ###
GLEAN3_03049 has the first half of the HCFC2 gene. GLEAN3_00228 likely codes the rest; it may be incorrectly predicted.
###Gene_Info_Comments GLEAN3_03049 ###
GLEAN3_03049 has the first half of the HCFC2 gene. GLEAN3_00228 likely codes the rest; it may be incorrectly predicted.
###Gene_Info_Comments GLEAN3_05886 ###
Missing first exon?
###Gene_Info_Comments GLEAN3_22607 ###
GLEAN3_28866 is a partial duplicate prediction for GLEAN3_22607.
###Gene_Info_Comments GLEAN3_23287 ###
GLEAN3_08313 is a partial duplicate prediction.
###Gene_Info_Comments GLEAN3_24246 ###
GLEAN3_24246 codes for part I of TSC2 and GLEAN3_23402 codes the rest.
###Gene_Info_Comments GLEAN3_23402 ###
GLEAN3_24246 codes for part I of TSC2 and GLEAN3_23402 codes the rest.
###Gene_Info_Comments GLEAN3_28104 ###
GLEAN3_28850 is a duplicate long prediction that is likely incorrect.
###Gene_Info_Comments GLEAN3_28850 ###
Incorrect longer prediction for RPL15. 
###Gene_Info_Comments GLEAN3_23692 ###
Appears to be short by ~29 aa at beginning.
###Gene_Info_Comments GLEAN3_05571 ###
Incomplete gene model.
###Gene_Info_Comments GLEAN3_09084 ###
Missing last half.
###Gene_Info_Comments GLEAN3_24153 ###
N-terminus of APC1; the remaining parts of the gene are found in GLEAN3_08018 and GLEAN3_12580.
###Gene_Info_Comments GLEAN3_08018 ###
see GLEAN3_24153 for annotation
###Gene_Info_Comments GLEAN3_12580 ###
see GLEAN3_24153 for annotation
###Gene_Info_Comments GLEAN3_21616 ###
This Glean contains the C-terminal sequence of Psf2 (aa 186-301). 
###Gene_Info_Comments GLEAN3_17818 ###
GLEAN3_17818 covers the C-terminal sequence of the GINS protein subunit Psf1
GLEAN3_17817 covers the N-terminal sequence of the GINS protein subunit Psf1
###Gene_Info_Comments GLEAN3_17817 ###

GLEAN3_17817 covers the N-terminal sequence of the GINS protein subunit Psf1
GLEAN3_17818 covers the C-terminal sequence of the GINS protein subunit Psf1

###Gene_Info_Comments GLEAN3_18318 ###
Missing ~60aa at the end.
###Gene_Info_Comments GLEAN3_02606 ###
GLEAN3_27769 is a partial duplicate prediction for GLEAN3_02606.
###Gene_Info_Comments Sp-DNAH4 ###
The coding sequence of this gene has been assembled from exons identified in the 3/2005 Version of the S. purpuratus genome by indexing to the homologous mammalian gene, accession number AAM12861 (Morris, RL, et al., Dev. Biol. in press). The exons span multiple scaffolds (see below) in the current version of the genome and their detailed coordinates will be added to the database at a later stage. In the meantime, exon details can be obtained by email from the annotators robar@scientist.com and igibbons@berkeley.edu. Merge of scaffold78480, scaffold102459, scaffold95646 and scaffold102781. Sp-DNAH4 is 54% identical to human axonemal dynein heavy chain 3.


###Gene_Info_Comments Sp-DNAH2 ###
The coding sequence of this gene has been assembled from exons identified in the 3/2005 Version of the S. purpuratus genome by indexing to the homologous mammalian gene, accession number NP_065928 (Morris, RL, et al., Dev. Biol. in press). The exons span multiple scaffolds (see below) in the current version of the genome and their detailed coordinates will be added to the database at a later stage. In the meantime, exon details can be obtained by email from the annotators robar@scientist.com and igibbons@berkeley.edu. Merge of scaffold2032, scaffold62207, scaffold46527, scaffold22838.
###Gene_Info_Comments Sp-DNAH3 ###
The coding sequence of this gene has been assembled from exons identified in the 3/2005 Version of the S. purpuratus genome by indexing to the homologous mammalian gene, accession number NP060009 (Morris, RL, et al., Dev. Biol. in press). The exons span multiple scaffolds (see below) in the current version of the genome and their detailed coordinates will be added to the database at a later stage. In the meantime, exon details can be obtained by email from the annotators robar@scientist.com and igibbons@berkeley.edu.  Merge of scaffold108582 and scaffold87263
###Gene_Info_Comments Sp-DNAH8 ###
The coding sequence of this gene has been assembled from exons identified in the 3/2005 Version of the S. purpuratus genome by indexing to the homologous mammalian gene, accession number CAI42433 (Morris, RL, et al., Dev. Biol. in press). The exons span multiple scaffolds (see below) in the current version of the genome and their detailed coordinates will be added to the database at a later stage. In the meantime, exon details can be obtained by email from the annotators robar@scientist.com and igibbons@berkeley.edu. Merge of scaffold289, scaffold56250, scaffoldi854 (3/2006 assembly), scaffold56250
###Gene_Info_Comments Sp-DNAH14 ###
The coding sequence of this gene has been assembled from exons identified in the 3/2005 Version of the S. purpuratus genome by indexing to the homologous mammalian gene, accession number XP_943287 (Morris, RL, et al., Dev. Biol. in press). The exons span multiple scaffolds (see below) in the current version of the genome and their detailed coordinates will be added to the database at a later stage. In the meantime, exon details can be obtained by email from the annotators robar@scientist.com and igibbons@berkeley.edu. Merge of scaffold86991, scaffold33286, scaffold58811, scaffold65994,
###Gene_Info_Comments Sp-DYNC2H1 ###
The coding sequence of this gene has been assembled from exons identified in the 3/2005 Version of the S. purpuratus genome by indexing to the homologous sea urchin (Tripneustes gratilla) gene, accession number AAA63583 (Morris, RL, et al., Dev. Biol. in press). The exons span multiple scaffolds (see below) in the current version of the genome and their detailed coordinates will be added to the database at a later stage. In the meantime, exon details can be obtained by email from the annotators robar@scientist.com and igibbons@berkeley.edu. Merge of scaffold23743, scaffold_v2_46821, scaffold28672, scaffold_v2_2321, scaffold53559
###Gene_Info_Comments GLEAN3_20648 ###
Note that GLEAN3_20649 probably has the C-terminal exon of this gene.
###Gene_Info_Comments GLEAN3_20649 ###
This probably represents the C terminal exon of GLEAN3_20648, and should be combined with that model.
###Gene_Info_Comments GLEAN3_14454 ###
GLEAN3_15818 is a partial duplicate.
###Gene_Info_Comments Sp-DNAH5 ###
The coding sequence of this gene has been assembled from exons identified in the 3/2005 Version of the S. purpuratus genome by indexing to the homologous mammalian gene, accession number CAI42433 (Morris, RL, et al., Dev. Biol. in press). The exons span multiple scaffolds (see below) in the current version of the genome and their detailed coordinates will be added to the database at a later stage. In the meantime, exon details can be obtained by email from the annotators robar@scientist.com and igibbons@berkeley.edu.  Merge of scaffold1668, scaffold59332, scaffold1157, scaffold11477
###Gene_Info_Comments Sp-DNAH6 ###
The coding sequence of this gene has been assembled from exons identified in the 3/2005 Version of the S. purpuratus genome by indexing to the homologous mammalian gene, accession number XP_532984 (Morris, RL, et al., Dev. Biol. in press). The exons span multiple scaffolds (see below) in the current version of the genome and their detailed coordinates will be added to the database at a later stage. In the meantime, exon details can be obtained by email from the annotators robar@scientist.com and igibbons@berkeley.edu.  Merge of scaffold1866 and  scaffold105269
 
###Gene_Info_Comments Sp-DNAH7 ###
The coding sequence of this gene has been assembled from exons identified in the 3/2005 Version of the S. purpuratus genome by indexing to the homologous mammalian gene, accession number NP_061720 (Morris, RL, et al., Dev. Biol. in press). The exons span multiple scaffolds (see below) in the current version of the genome and their detailed coordinates will be added to the database at a later stage. In the meantime, exon details can be obtained by email from the annotators robar@scientist.com and igibbons@berkeley.edu. Merge of scaffold1775 and scaffold50668 
###Gene_Info_Comments Sp-DNAH9 ###
The coding sequence of this gene has been assembled from exons identified in the 3/2005 Version of the S. purpuratus genome by indexing to the homologous sea urchin (Tripneustes gratilla) gene, accession number CAA42170. The exons span multiple scaffolds (see below) in the current version of the genome and their detailed coordinates will be added to the database at a later stage. In the meantime, exon details can be obtained by email from the annotators robar@scientist.com and igibbons@berkeley.edu.(Morris, RL, et al., Dev. Biol. in press). Merge of scaffold27, scaffold28695
###Gene_Info_Comments Sp-DNAH10 ###
The coding sequence of this gene has been assembled from exons identified in the 3/2005 Version of the S. purpuratus genome by indexing to the homologous mammalian gene, accession number XP_543369 (Morris, RL, et al., Dev. Biol. in press). The exons span multiple scaffolds (see below) in the current version of the genome and their detailed coordinates will be added to the database at a later stage. In the meantime, exon details can be obtained by email from the annotators robar@scientist.com and igibbons@berkeley.edu. Merge of scaffold75058, scaffold105976, scaffold102255, scaffold52292, scaffold48159
###Gene_Info_Comments Sp-DNAH12 ###
The coding sequence of this gene has been assembled from exons identified in the 3/2005 Version of the S. purpuratus genome by indexing to the homologous mammalian gene, accession number XP_541831 (Morris, RL, et al., Dev. Biol. in press). The exons span multiple scaffolds (see below) in the current version of the genome and their detailed coordinates will be added to the database at a later stage. In the meantime, exon details can be obtained by email from the annotators robar@scientist.com and igibbons@berkeley.edu. Merge of scaffold108117, scaffold694, scaffold133848
###Gene_Info_Comments Sp-DNAH15 ###
The coding sequence of this gene has been assembled from exons identified in the 3/2005 Version of the S. purpuratus genome by indexing to the homologous mammalian gene, accession number NP_001360 (Morris, RL, et al., Dev. Biol. in press). The exons span multiple scaffolds (see below) in the current version of the genome and their detailed coordinates will be added to the database at a later stage. In the meantime, exon details can be obtained by email from the annotators robar@scientist.com and igibbons@berkeley.edu. Merge of scaffold1897, scaffold128501, scaffold55352, scaffold71847, scaffold26249
###Gene_Info_Comments Sp-DYNC1H1 ###
The coding sequence of this gene has been assembled from exons identified in the 3/2005 Version of the S. purpuratus genome by indexing to the homologous mammalian gene, accession number Q14204 (Morris, RL, et al., Dev. Biol. in press). The exons span multiple scaffolds (see below) in the current version of the genome and their detailed coordinates will be added to the database at a later stage. In the meantime, exon details can be obtained by email from the annotators robar@scientist.com and igibbons@berkeley.edu.  Merge of scaffold9923, scaffold88369, scaffold88369, scaffold3805
###Gene_Info_Comments GLEAN3_14060 ###
First part of the gene model prediction is incorrect.
###Gene_Info_Comments Sp-DNAH1 ###
The coding sequence of this gene has been assembled from exons identified in the 3/2005 Version of the S. purpuratus genome by indexing to the homologous mammalian gene, accession number BAA92648 (Morris, RL, et al., Dev. Biol. in press). The exons span multiple scaffolds (see below) in the current version of the genome and their detailed coordinates will be added to the database at a later stage. In the meantime, exon details can be obtained by email from the annotators robar@scientist.com and igibbons@berkeley.edu.
Merge of scaffold60126, scaffold75782, scaffold50318, scaffold83954. Sp-DNAH1 is 63% identical to human axonemal dynein heavy chain 1
###Gene_Info_Comments GLEAN3_02458 ###
GLEAN3_21702 is a duplicate prediction.
###Gene_Info_Comments GLEAN3_21702 ###
Duplicate prediction for GLEAN3_02458
###Gene_Info_Comments GLEAN3_11152 ###
Partial duplicate of GLEAN3_06459.
###Gene_Info_Comments GLEAN3_09946 ###
Prediction is short by ~20 AA.
###Gene_Info_Comments GLEAN3_07971 ###
GLEAN3_11900 is a duplicate prediction.
###Gene_Info_Comments GLEAN3_11900 ###
GLEAN3_07971 duplicate.
###Gene_Info_Comments GLEAN3_22682 ###
Likely has an extra ~100 aa predicted towards the end.
###Gene_Info_Comments GLEAN3_06487 ###
Incorrect gene model. First ~120 AA are completely unrelated.
###Gene_Info_Comments GLEAN3_11456 ###
GLEAN3_19736 is a duplicate prediction.
###Gene_Info_Comments GLEAN3_19736 ###
GLEAN3_11456 is a duplicate prediction.
###Gene_Info_Comments GLEAN3_04087 ###
Partial duplicate prediction for GLEAN3_02438.
###Gene_Info_Comments GLEAN3_08650 ###
Incorrect gene model. Extra AA at beginning and end.
###Gene_Info_Comments GLEAN3_18079 ###
GLEAN3_04721 is a duplicate prediction.
###Gene_Info_Comments GLEAN3_13834 ###
Longer than necessary prediction.
###Gene_Info_Comments GLEAN3_08394 ###
GLEAN3_06605 has first half and GLEAN3_08394 has the rest.
###Gene_Info_Comments GLEAN3_06605 ###
GLEAN3_06605 has first half and GLEAN3_08394 has the rest.
###Gene_Info_Comments GLEAN3_10936 ###
GLEAN3_05825 is a partial duplicate prediction.
###Gene_Info_Comments GLEAN3_05825 ###
GLEAN3_05825 is a duplicate prediction for GLEAN3_10936.
###Gene_Info_Comments GLEAN3_18866 ###
Missing ~25 AA at end.
###Gene_Info_Comments GLEAN3_10692 ###
GLEAN3_26671 is a partial duplicate prediction.
###Gene_Info_Comments GLEAN3_05700 ###
GLEAN3_05700 is a partial duplicate of GLEAN3_05699.
###Gene_Info_Comments GLEAN3_11352 ###
GLEAN3_20308 and GLEAN3_01585 are partially identical predictions.
###Gene_Info_Comments GLEAN3_20308 ###
Partial duplicate prediction for GLEAN3_11352.
###Gene_Info_Comments GLEAN3_11303 ###
GLEAN3_11303 appears to have first part and GLEAN3_04643 the latter half.
###Gene_Info_Comments GLEAN3_04643 ###
GLEAN3_11303 appears to have first part and GLEAN3_04643 the latter half.
###Gene_Info_Comments GLEAN3_21496 ###
This gene was identified and published by Nemer et al. 1991.
###Gene_Info_Comments GLEAN3_24608 ###
GLEAN3_25585 appears to have part I of CAND1 and GLEAN3_24608 the other part. There is an overlap of ~50 AA.
###Gene_Info_Comments GLEAN3_25585 ###
GLEAN3_25585 appears to have part I of CAND1 and GLEAN3_24608 the other part. There is an overlap of ~50 AA.
###Gene_Info_Comments GLEAN3_15870 ###
GLEAN3_27123 is a partial duplicate prediction.
###Gene_Info_Comments GLEAN3_27123 ###
Duplicate prediction for GLEAN3_15870.
###Gene_Info_Comments GLEAN3_05722 ###
GLEAN3_23814 is a partial duplicate prediction.
###Gene_Info_Comments GLEAN3_23814 ###
Partial duplicate prediction for GLEAN3_05722.
###Gene_Info_Comments GLEAN3_24801 ###
Likely missing first exon.
###Gene_Info_Comments GLEAN3_25140 ###
GLEAN3_18792 has first part and GLEAN3_25140 has the latter half. There is some overlap in the two predictions.
###Gene_Info_Comments GLEAN3_18792 ###
GLEAN3_18792 has first part and GLEAN3_25140 has the latter half. There is some overlap in the two predictions.
###Gene_Info_Comments GLEAN3_23345 ###
GLEAN3_03864 appears to have part I. GLEAN3_23345 has part II. GLEAN3_00408 appears to have the rest of the protein.
###Gene_Info_Comments GLEAN3_03864 ###
GLEAN3_03864 appears to have part I. GLEAN3_23345 has part II. GLEAN3_00408 appears to have the rest of the protein.
###Gene_Info_Comments GLEAN3_00408 ###
GLEAN3_03864 appears to have part I. GLEAN3_23345 has part II. GLEAN3_00408 appears to have the rest of the protein.
###Gene_Info_Comments GLEAN3_10706 ###
May be missing ~100 AA at the end.
###Gene_Info_Comments GLEAN3_21755 ###
Duplicate prediction for GLEAN3_00408
###Gene_Info_Comments GLEAN3_10014 ###
GLEAN3_10014 has first part and GLEAN3_06607 has the latter half.
###Gene_Info_Comments GLEAN3_06607 ###
GLEAN3_10014 has first part and GLEAN3_06607 has the latter half.
###Gene_Info_Comments GLEAN3_18853 ###
Hit to the bestrophin homology is embedded within a longer prediction. 
###Gene_Info_Comments GLEAN3_17216 ###
GLEAN3_10982 is a partial duplicate prediction.
###Gene_Info_Comments GLEAN3_10982 ###
GLEAN3_10982 is a partial duplicate prediction.
###Gene_Info_Comments GLEAN3_17745 ###
Model is longer than necessary.
###Gene_Info_Comments GLEAN3_04715 ###
Model is likely missing a few AA at beginning.
###Gene_Info_Comments GLEAN3_01164 ###
Likely missing ~50 AA at beginning.
###Gene_Info_Comments GLEAN3_18367 ###
Prediction is short ~50 AA at beginning.
###Gene_Info_Comments GLEAN3_25476 ###
Partial duplicate prediction for GLEAN3_17333.
###Gene_Info_Comments GLEAN3_02228 ###
GLEAN3_22387 is a partial duplicate prediction.
###Gene_Info_Comments GLEAN3_22387 ###
GLEAN3_02228 is a partial duplicate prediction.
###Gene_Info_Comments GLEAN3_21649 ###
GLEAN3_18977 has part I and GLEAN3_21649 appears to have the rest of the gene.
###Gene_Info_Comments GLEAN3_18977 ###
GLEAN3_18977 has part I and GLEAN3_21649 appears to have the rest of the gene.
###Gene_Info_Comments GLEAN3_15415 ###
GLEAN3_03876 is a partial duplicate prediction.
###Gene_Info_Comments GLEAN3_03876 ###
GLEAN3_03876 is a partial duplicate prediction for GLEAN3_15415.
###Gene_Info_Comments GLEAN3_04528 ###
May be missing ~50 AA at end.
###Gene_Info_Comments GLEAN3_24347 ###
This model may contain 2 genes. Latter half is the DNA cytosine methylase (DNMT3A).
###Gene_Info_Comments GLEAN3_09023 ###
GLEAN3_12849 is a partial duplicate prediction for GLEAN3_09023.
###Gene_Info_Comments GLEAN3_12849 ###
GLEAN3_12849 is a partial duplicate prediction for GLEAN3_09023.
###Gene_Info_Comments GLEAN3_25761 ###
GLEAN3_25761 has part I. GLEAN3_20254 has part II. GLEAN3_20255 has part III. Missing a few AA at the end.
###Gene_Info_Comments GLEAN3_20255 ###
GLEAN3_25761 has part I. GLEAN3_20254 has part II. GLEAN3_20255 has part III. Missing a few AA at the end.
###Gene_Info_Comments GLEAN3_20254 ###
GLEAN3_25761 has part I. GLEAN3_20254 has part II. GLEAN3_20255 has part III. Missing a few AA at the end.
###Gene_Info_Comments GLEAN3_19071 ###
GLEAN3_19071 is a duplicate prediction for GLEAN3_24656.
###Gene_Info_Comments GLEAN3_24656 ###
GLEAN3_19071 is a duplicate prediction for GLEAN3_24656.
###Gene_Info_Comments GLEAN3_09473 ###
Missing ~200 AA at the beginning.
###Gene_Info_Comments GLEAN3_14637 ###
GLEAN3_11917 is a partial duplicate prediction for GLEAN3_14637.
###Gene_Info_Comments GLEAN3_11917 ###
GLEAN3_11917 is a partial duplicate prediction for GLEAN3_14637.
###Gene_Info_Comments GLEAN3_06672 ###
Missing ~150 AA at end.
###Gene_Info_Comments GLEAN3_11688 ###
GLEAN3_11688 is a duplicate prediction for GLEAN3_08690. It also may represent a longer incorrect model for this gene.
###Gene_Info_Comments GLEAN3_08690 ###
GLEAN3_11688 is a duplicate prediction for GLEAN3_08690. It also may represent a longer incorrect model for this gene.
###Gene_Info_Comments GLEAN3_09423 ###
GLEAN3_09422 has the first part and GLEAN3_09423 has the rest of the gene.
###Gene_Info_Comments GLEAN3_09422 ###
GLEAN3_09422 has the first part and GLEAN3_09423 has the rest of the gene.
###Gene_Info_Comments GLEAN3_25290 ###
GLEAN3_25290 is a partial duplicate prediction for GLEAN3_04522.
###Gene_Info_Comments GLEAN3_11094 ###
GLEAN3_06390 may encode for first part of NOC3L gene. GLEAN3_11094 has the rest. GLEAN3_16413 is a partial duplicate prediction for GLEAN3_11094.
###Gene_Info_Comments GLEAN3_06390 ###
GLEAN3_06390 may encode for first part of NOC3L gene. GLEAN3_11094 has the rest.
###Gene_Info_Comments GLEAN3_16413 ###
GLEAN3_06390 may encode for first part of NOC3L gene. GLEAN3_11094 has the rest. GLEAN3_16413 is a partial duplicate prediction for GLEAN3_11094.
###Gene_Info_Comments GLEAN3_03253 ###
GLEAN3_12701 has part I, GLEAN3_18955 has part II, GLEAN3_03253 has part III and GLEAN3_04821 has the last part of the gene. GLEAN3_28421 is a partial duplicate prediction for GLEAN3_04821. GLEAN3_05616 is a partial duplicate prediction for GLEAN3_18955.
###Gene_Info_Comments GLEAN3_04821 ###
GLEAN3_12701 has part I, GLEAN3_18955 has part II, GLEAN3_03253 has part III and GLEAN3_04821 has the last part of the gene. GLEAN3_28421 is a partial duplicate prediction for GLEAN3_04821. GLEAN3_05616 is a partial duplicate prediction for GLEAN3_18955.
###Gene_Info_Comments GLEAN3_12701 ###
GLEAN3_12701 has part I, GLEAN3_18955 has part II, GLEAN3_03253 has part III and GLEAN3_04821 has the last part of the gene. GLEAN3_28421 is a partial duplicate prediction for GLEAN3_04821. GLEAN3_05616 is a partial duplicate prediction for GLEAN3_18955.
###Gene_Info_Comments GLEAN3_18955 ###
GLEAN3_12701 has part I, GLEAN3_18955 has part II, GLEAN3_03253 has part III and GLEAN3_04821 has the last part of the gene. GLEAN3_28421 is a partial duplicate prediction for GLEAN3_04821. GLEAN3_05616 is a partial duplicate prediction for GLEAN3_18955.
###Gene_Info_Comments GLEAN3_28421 ###
GLEAN3_12701 has part I, GLEAN3_18955 has part II, GLEAN3_03253 has part III and GLEAN3_04821 has the last part of the gene. GLEAN3_28421 is a partial duplicate prediction for GLEAN3_04821. GLEAN3_05616 is a partial duplicate prediction for GLEAN3_18955.
###Gene_Info_Comments GLEAN3_05616 ###
GLEAN3_12701 has part I, GLEAN3_18955 has part II, GLEAN3_03253 has part III and GLEAN3_04821 has the last part of the gene. GLEAN3_28421 is a partial duplicate prediction for GLEAN3_04821. GLEAN3_05616 is a partial duplicate prediction for GLEAN3_18955.
###Gene_Info_Comments GLEAN3_28675 ###
GLEAN3_28675 is a partial duplicate prediction for GLEAN3_26740.
###Gene_Info_Comments GLEAN3_03696 ###
Missing ~150 AA at beginning.
###Gene_Info_Comments GLEAN3_09849 ###
May be missing ~25 AA at the beginning.
###Gene_Info_Comments GLEAN3_04974 ###
GLEAN3_04974 is a duplicate prediction for GLEAN3_06042.
###Gene_Info_Comments GLEAN3_14021 ###
GLEAN3_14021 is a partial duplicate prediction for GLEAN3_13399.
###Gene_Info_Comments GLEAN3_07111 ###
GLEAN3_07455 is a duplicate longer prediction.
###Gene_Info_Comments GLEAN3_00285 ###
GLEAN3_00285 has most of the gene except the last part which appears to be on GLEAN3_09456. There are errors in the prediction at the end of GLEAN3_00285 and at beginning of GLEAN3_09456 which perfectly match with the COPZ1 gene.
###Gene_Info_Comments GLEAN3_09546 ###
GLEAN3_00285 has most of the gene except the last part which appears to be on GLEAN3_09456. There are errors in the prediction at the end of GLEAN3_00285 and at beginning of GLEAN3_09456 which perfectly match with the COPZ1 gene.
###Gene_Info_Comments GLEAN3_12559 ###
GLEAN3_12559 and GLEAN3_26314 both contain the AP2S1 gene but both are erroneous long predictions. The beginnings of the predictions do not code for any significant proteins in DB.
###Gene_Info_Comments GLEAN3_26314 ###
GLEAN3_12559 and GLEAN3_26314 both contain the AP2S1 gene but both are erroneous long predictions. The beginnings of the predictions do not code for any significant proteins in DB.
###Gene_Info_Comments GLEAN3_18163 ###
GLEAN3_18163 and GLEAN3_16841 are partially duplicate predictions.
###Gene_Info_Comments GLEAN3_16841 ###
GLEAN3_18163 and GLEAN3_16841 are partially duplicate predictions.
###Gene_Info_Comments GLEAN3_02616 ###
GLEAN3_18244 is a partial duplicate prediction for GLEAN3_02616.
###Gene_Info_Comments GLEAN3_18244 ###
GLEAN3_18244 is a partial duplicate prediction for GLEAN3_02616.
###Gene_Info_Comments GLEAN3_01148 ###
GLEAN3_01148 - GLEAN3_03209 - GLEAN3_09645 represent three parts of the CLPTM1 gene, perhaps in that order (and with overlap). It is difficult to be certain as only parts of GLEAN3_03209 and GLEAN3_09645 are homologous to CLPTM1.
###Gene_Info_Comments GLEAN3_03209 ###
GLEAN3_01148 - GLEAN3_03209 - GLEAN3_09645 represent three parts of the CLPTM1 gene, perhaps in that order (and with overlap). It is difficult to be certain as only parts of GLEAN3_03209 and GLEAN3_09645 are homologous to CLPTM1.
###Gene_Info_Comments GLEAN3_09645 ###
GLEAN3_01148 - GLEAN3_03209 - GLEAN3_09645 represent three parts of the CLPTM1 gene, perhaps in that order (and with overlap). It is difficult to be certain as only parts of GLEAN3_03209 and GLEAN3_09645 are homologous to CLPTM1.
###Gene_Info_Comments GLEAN3_00731 ###
GLEAN3_00731 and GLEAN3_07244 represent two parts of the CLN3 protein. There is perhaps ~50 AA missing from GLEAN3_07244 model.
###Gene_Info_Comments GLEAN3_07244 ###
GLEAN3_00731 and GLEAN3_07244 represent two parts of the CLN3 protein. There is perhaps ~50 AA missing from GLEAN3_07244 model.
###Gene_Info_Comments GLEAN3_01258 ###
GLEAN3_10663 is a duplicate prediction for GLEAN3_01258.
###Gene_Info_Comments GLEAN3_10663 ###
GLEAN3_10663 is a duplicate prediction for GLEAN3_01258.
###Gene_Info_Comments GLEAN3_10747 ###
GLEAN3_25121 is a duplicate prediction for GLEAN3_10747.
###Gene_Info_Comments GLEAN3_25121 ###
GLEAN3_25121 is a duplicate prediction for GLEAN3_10747.
###Gene_Info_Comments GLEAN3_21311 ###
This gene model is incomplete. Missing ~350 AA at the end. GLEAN3_14811 MAY represent the missing half. It does show COPB2 as the best hit in searches against DB.
###Gene_Info_Comments GLEAN3_23769 ###
Missing ~100 AA at end.
###Gene_Info_Comments GLEAN3_09246 ###
Model is incorrect.
###Gene_Info_Comments GLEAN3_26357 ###
GLEAN3_03524 is a partial duplicate prediction for GLEAN3_26357.
###Gene_Info_Comments GLEAN3_03524 ###
GLEAN3_03524 is a partial duplicate prediction for GLEAN3_26357.
###Gene_Info_Comments GLEAN3_24920 ###
GLEAN3_04667 is a partial duplicate prediction for GLEAN3_24920.
###Gene_Info_Comments GLEAN3_04667 ###
GLEAN3_04667 is a partial duplicate prediction for GLEAN3_24920.
###Gene_Info_Comments GLEAN3_02885 ###
Has an additional ~45 AA at beginning of protein.
###Gene_Info_Comments GLEAN3_07188 ###
GLEAN3_07188 has the first half of DDB1 and GLEAN3_07379 has the rest. A few AA may be missing at the beginning.
###Gene_Info_Comments GLEAN3_07379 ###
GLEAN3_07188 has the first half of DDB1 and GLEAN3_07379 has the rest. A few AA may be missing at the beginning.
###Gene_Info_Comments GLEAN3_09266 ###
GLEAN3_09266 is a partial duplicate prediction for GLEAN3_01826.
###Gene_Info_Comments GLEAN3_10339 ###
GLEAN3_07300 is a partial duplicate prediction for GLEAN3_10339.
###Gene_Info_Comments GLEAN3_07300 ###
GLEAN3_07300 is a partial duplicate prediction for GLEAN3_10339.
###Gene_Info_Comments GLEAN3_19786 ###
GLEAN3_19786 is a partial duplicate prediction for GLEAN3_13953.
###Gene_Info_Comments GLEAN3_04860 ###
GLEAN3_24432 is a partial duplicate prediction for GLEAN3_04860.
###Gene_Info_Comments GLEAN3_24432 ###
GLEAN3_24432 is a partial duplicate prediction for GLEAN3_04860.
###Gene_Info_Comments GLEAN3_03593 ###
GLEAN3_03593 has the first half of the FBXL10 gene. GLEAN3_25153 MAY encode the other half. GLEAN3_02864 is a partial duplicate prediction for GLEAN3_03593.
###Gene_Info_Comments GLEAN3_25153 ###
GLEAN3_03593 has the first half of the FBXL10 gene. GLEAN3_25153 MAY encode the other half. GLEAN3_02864 is a partial duplicate prediction for GLEAN3_03593.
###Gene_Info_Comments GLEAN3_02864 ###
GLEAN3_03593 has the first half of the FBXL10 gene. GLEAN3_25153 MAY encode the other half. GLEAN3_02864 is a partial duplicate prediction for GLEAN3_03593.
###Gene_Info_Comments GLEAN3_15361 ###
Contains a leucine-rich repeat (LRR-R1) domain.
###Gene_Info_Comments GLEAN3_05729 ###
GLEAN3_05729 and GLEAN3_08183 are similar duplicate predictions.
###Gene_Info_Comments GLEAN3_08183 ###
GLEAN3_05729 and GLEAN3_08183 are similar duplicate predictions.
###Gene_Info_Comments GLEAN3_24772 ###
The gene model on the N-terminal end appears to be incorrect.
###Gene_Info_Comments GLEAN3_07722 ###
GLEAN3_07722 encodes the complete CYFIP2 gene. GLEAN3_07272 and GLEAN3_11333 are partial duplicate predictions for GLEAN3_07722.
###Gene_Info_Comments GLEAN3_07272 ###
GLEAN3_07722 encodes the complete CYFIP2 gene. GLEAN3_07272 and GLEAN3_11333 are partial duplicate predictions for GLEAN3_07722.
###Gene_Info_Comments GLEAN3_11333 ###
GLEAN3_07722 encodes the complete CYFIP2 gene. GLEAN3_07272 and GLEAN3_11333 are partial duplicate predictions for GLEAN3_07722.
###Gene_Info_Comments GLEAN3_17645 ###
GLEAN3_12959 is a partial duplicate prediction for GLEAN3_17645.
###Gene_Info_Comments GLEAN3_12959 ###
GLEAN3_12959 is a partial duplicate prediction for GLEAN3_17645.
###Gene_Info_Comments GLEAN3_00487 ###
GLEAN3_00487 has the first half of the gene. GLEAN3_27232 MAY have the rest.
###Gene_Info_Comments GLEAN3_27232 ###
GLEAN3_00487 has the first half of the gene. GLEAN3_27232 MAY have the rest.
###Gene_Info_Comments GLEAN3_03575 ###
GLEAN3_03575 and GLEAN3_11118 are partial duplicate predictions for GLEAN3_27232.
###Gene_Info_Comments GLEAN3_11118 ###
GLEAN3_03575 and GLEAN3_11118 are partial duplicate predictions for GLEAN3_27232.
###Gene_Info_Comments GLEAN3_04739 ###
GLEAN3_04739 is a partial duplicate prediction for GLEAN3_12301.
###Gene_Info_Comments GLEAN3_12301 ###
GLEAN3_04739 is a partial duplicate prediction for GLEAN3_12301.
###Gene_Info_Comments GLEAN3_09085 ###
Likely isoform of DHPS.
###Gene_Info_Comments GLEAN3_17680 ###
Likely isoform of DHPS.
###Gene_Info_Comments GLEAN3_22098 ###
Likely isoform of DHPS.
###Gene_Info_Comments GLEAN3_25930 ###
Duplicate prediction for GLEAN3_12301.
###Gene_Info_Comments GLEAN3_00231 ###
GLEAN3_05915 has part II of this gene. GLEAN3_27703 MAY encode the first part. GLEAN3_00273 is a duplicate prediction for GLEAN3_27703.
###Gene_Info_Comments GLEAN3_05915 ###
GLEAN3_05915 has part II of this gene. GLEAN3_27703 MAY encode the first part. GLEAN3_00273 is a duplicate prediction for GLEAN3_27703.
###Gene_Info_Comments GLEAN3_27703 ###
GLEAN3_05915 has part II of this gene. GLEAN3_27703 MAY encode the first part. GLEAN3_00273 is a duplicate prediction for GLEAN3_27703.
###Gene_Info_Comments GLEAN3_06485 ###
GLEAN3_06485 and GLEAN3_08534 are duplicate predictions.
###Gene_Info_Comments GLEAN3_08534 ###
GLEAN3_06485 and GLEAN3_08534 are duplicate predictions.
###Gene_Info_Comments GLEAN3_18067 ###
GLEAN3_21911 is a duplicate prediction for GLEAN3_18067.
###Gene_Info_Comments GLEAN3_21911 ###
GLEAN3_21911 is a duplicate prediction for GLEAN3_18067.
###Gene_Info_Comments GLEAN3_10109 ###
Similar to both ZDHHC2 and ZDHHC20.
###Gene_Info_Comments GLEAN3_17163 ###
GLEAN3_17163 has part I and GLEAN3_17164 has part II of ZDHHC7 gene.
###Gene_Info_Comments GLEAN3_17164 ###
GLEAN3_17163 has part I and GLEAN3_17164 has part II of ZDHHC7 gene.
###Gene_Info_Comments GLEAN3_21834 ###
GLEAN3_21833 has part I and GLEAN3_21834 has part II of ZDHHC17 gene.

###Gene_Info_Comments GLEAN3_21833 ###
GLEAN3_21833 has part I and GLEAN3_21834 has part II of ZDHHC17 gene.

###Gene_Info_Comments GLEAN3_12476 ###
This gene blasts back to the mannose receptor for a VERY large range of animals from Human to Danio.
###Gene_Info_Comments GLEAN3_06842 ###
GLEAN3_27068 is a duplicate prediction for GLEAN3_06842.
###Gene_Info_Comments GLEAN3_27068 ###
GLEAN3_27068 is a duplicate prediction for GLEAN3_06842.
###Gene_Info_Comments GLEAN3_12025 ###
GLEAN3_12025 has the first half of the gene. GLEAN3_22214 has the rest.
###Gene_Info_Comments GLEAN3_22214 ###
GLEAN3_12025 has the first half of the gene. GLEAN3_22214 has the rest.
###Gene_Info_Comments GLEAN3_09348 ###
Likely missing latter half of LIN9 gene. GLEAN3_08517 MAY code for the missing half.
###Gene_Info_Comments GLEAN3_02194 ###
GLEAN3_27247 is a partial duplicate prediction for GLEAN3_02194.
###Gene_Info_Comments GLEAN3_27247 ###
GLEAN3_27247 is a partial duplicate prediction for GLEAN3_02194.
###Gene_Info_Comments GLEAN3_04401 ###
May be missing an exon at the beginning.
###Gene_Info_Comments GLEAN3_07535 ###
GLEAN3_07535 contains the first part of the gene and GLEAN3_04388 has the latter half. There is some overlap between the two models.
###Gene_Info_Comments GLEAN3_04388 ###
GLEAN3_07535 contains the first part of the gene and GLEAN3_04388 has the latter half. There is some overlap between the two models.
###Gene_Info_Comments GLEAN3_15460 ###
GLEAN3_15460 is a duplicate prediction for GLEAN3_09441.
###Gene_Info_Comments GLEAN3_09441 ###
GLEAN3_15460 is a duplicate prediction for GLEAN3_09441.
###Gene_Info_Comments GLEAN3_15152 ###
GLEAN3_04341 is a partial duplicate prediction for GLEAN3_15152.
###Gene_Info_Comments GLEAN3_04341 ###
GLEAN3_04341 is a partial duplicate prediction for GLEAN3_15152.
###Gene_Info_Comments GLEAN3_28370 ###
GLEAN3_28370 codes for first part of DNAH1 protein and GLEAN3_00013 codes for the rest. There may be an exon missing in between the two predictions.
###Gene_Info_Comments GLEAN3_00013 ###
GLEAN3_28370 codes for first part of DNAH1 protein and GLEAN3_00013 codes for the rest. There may be an exon missing in between the two predictions.
###Gene_Info_Comments GLEAN3_03564 ###
GLEAN3_15049 has first part of the DNAH7 gene and GLEAN3_03564 has the rest.
###Gene_Info_Comments GLEAN3_15049 ###
GLEAN3_15049 has first part of the DNAH7 gene and GLEAN3_03564 has the rest.
###Gene_Info_Comments GLEAN3_27326 ###
GLEAN3_27326 may have the first part of the gene. Rest is coded by GLEAN3_17271.
###Gene_Info_Comments GLEAN3_17271 ###
GLEAN3_27326 may have the first part of the gene. Rest is coded by GLEAN3_17271.
###Gene_Info_Comments GLEAN3_24529 ###
GLEAN3_24529 has the first part of DNAH5 gene and GLEAN3_03660 has the latter. There is overlap between the two GLEAN models (~2161-2763 AA).
###Gene_Info_Comments GLEAN3_03660 ###
GLEAN3_24529 has the first part of DNAH5 gene and GLEAN3_03660 has the latter. There is overlap between the two GLEAN models (~2161-2763 AA).
###Gene_Info_Comments GLEAN3_07564 ###
Missing the first half.
###Gene_Info_Comments GLEAN3_02335 ###
GLEAN3_02335 has most of the first part. GLEAN3_24805 codes for the rest. It is possible that these two GLEANs are partial predictions of two different genes.
###Gene_Info_Comments GLEAN3_24805 ###
GLEAN3_02335 has most of the first part. GLEAN3_24805 codes for the rest. It is possible that these two GLEANs are partial predictions of two different genes.
###Gene_Info_Comments GLEAN3_20931 ###
GLEAN3_10157 and GLEAN3_20427 are partial duplicate predictions for GLEAN3_20931.
###Gene_Info_Comments GLEAN3_10157 ###
GLEAN3_10157 and GLEAN3_20427 are partial duplicate predictions for GLEAN3_20931.
###Gene_Info_Comments GLEAN3_20427 ###
GLEAN3_10157 and GLEAN3_20427 are partial duplicate predictions for GLEAN3_20931.
###Gene_Info_Comments GLEAN3_17547 ###
Incomplete gene model.
###Gene_Info_Comments GLEAN3_22833 ###
GLEAN3_22833 is a duplicate prediction for GLEAN3_17547.
###Gene_Info_Comments GLEAN3_08312 ###
GLEAN3_28010 is a duplicate prediction for GLEAN3_08312.
###Gene_Info_Comments GLEAN3_28010 ###
GLEAN3_28010 is a duplicate prediction for GLEAN3_08312.
###Gene_Info_Comments GLEAN3_22907 ###
Gene model incorrect. Too long.
###Gene_Info_Comments GLEAN3_27407 ###
GLEAN3_27407 is a partial duplicate prediction for GLEAN3_27406.
###Gene_Info_Comments GLEAN3_27406 ###
GLEAN3_27407 is a partial duplicate prediction for GLEAN3_27406.
###Gene_Info_Comments GLEAN3_04463 ###
GLEAN3_25916 is a partial duplicate prediction for GLEAN3_04463.
###Gene_Info_Comments GLEAN3_25916 ###
GLEAN3_25916 is a partial duplicate prediction for GLEAN3_04463.
###Gene_Info_Comments GLEAN3_26427 ###
GLEAN3_13244 is a partial duplicate prediction for GLEAN3_26427.
###Gene_Info_Comments GLEAN3_13244 ###
GLEAN3_13244 is a partial duplicate prediction for GLEAN3_26427.
###Gene_Info_Comments GLEAN3_10328 ###
GLEAN3_26065 is a partial duplicate prediction for GLEAN3_10328.
###Gene_Info_Comments GLEAN3_26065 ###
GLEAN3_26065 is a partial duplicate prediction for GLEAN3_10328.
###Gene_Info_Comments GLEAN3_10620 ###
Larger than required prediction.
###Gene_Info_Comments GLEAN3_23970 ###
GLEAN3_23970 is a partial duplicate prediction for GLEAN3_19111.
###Gene_Info_Comments GLEAN3_19111 ###
GLEAN3_23970 is a partial duplicate prediction for GLEAN3_19111.
###Gene_Info_Comments GLEAN3_08304 ###
GLEAN3_08304 may represent a partial prediction for MYST2 (which is largely encoded by GLEAN3_17172, but is missing a piece that is present in GLEAN3_08304).
###Gene_Info_Comments GLEAN3_17172 ###
GLEAN3_08304 may represent a partial prediction for MYST2 (which is largely encoded by GLEAN3_17172, but is missing a piece that is present in GLEAN3_08304).
###Gene_Info_Comments GLEAN3_09734 ###
Missing ~70 AA at beginning.
###Gene_Info_Comments GLEAN3_14886 ###
GLEAN3_14887 is a longer duplicate prediction for GLEAN3_14886.
###Gene_Info_Comments GLEAN3_14887 ###
GLEAN3_14887 is a longer duplicate prediction for GLEAN3_14886.
###Gene_Info_Comments GLEAN3_11305 ###
Prediction likely short by ~50 AA at beginning.
###Gene_Info_Comments GLEAN3_13192 ###
GLEAN3_13192 is a partial duplicate of GLEAN3_19276.
###Gene_Info_Comments GLEAN3_13812 ###
GLEAN3_13812 encodes Part I of XRN1 and GLEAN3_13813 probably encodes the rest (though may be partially incorrect).
###Gene_Info_Comments GLEAN3_19276 ###
GLEAN3_19276 is a partial prediction that may be missing ~180-200 AA at the end.
###Gene_Info_Comments GLEAN3_13813 ###
GLEAN3_13812 encodes Part I of XRN1 and GLEAN3_13813 probably encodes the rest (though may be partially incorrect).
###Gene_Info_Comments GLEAN3_21007 ###
GLEAN3_21007 may be missing a few AA at the end. GLEAN3_24819 is a partial duplicate prediction for GLEAN3_21007. GLEAN3_00416 may be as well.
###Gene_Info_Comments GLEAN3_24819 ###
GLEAN3_21007 may be missing a few AA at the end. GLEAN3_24819 is a partial duplicate prediction for GLEAN3_21007. GLEAN3_00416 may be as well.
###Gene_Info_Comments GLEAN3_00461 ###
GLEAN3_21007 may be missing a few AA at the end. GLEAN3_24819 is a partial duplicate prediction for GLEAN3_21007. GLEAN3_00416 may be as well.
###Gene_Info_Comments GLEAN3_04023 ###
GLEAN3_04023 is a partial duplicate prediction for GLEAN3_11694.
###Gene_Info_Comments GLEAN3_11694 ###
GLEAN3_04023 is a partial duplicate prediction for GLEAN3_11694.
###Gene_Info_Comments GLEAN3_04608 ###
GLEAN3_04608 is a duplicate prediction for GLEAN3_04445.
###Gene_Info_Comments GLEAN3_04445 ###
GLEAN3_04608 is a duplicate prediction for GLEAN3_04445.
###Gene_Info_Comments GLEAN3_14305 ###
GLEAN3_27401 is a partial duplicate prediction for GLEAN3_14305.
###Gene_Info_Comments GLEAN3_27401 ###
GLEAN3_27401 is a partial duplicate prediction for GLEAN3_14305.
###Gene_Info_Comments GLEAN3_23752 ###
Prediction is incorrect; much longer than required.
###Gene_Info_Comments GLEAN3_26862 ###
GLEAN3_26862 has the first part of the FTSJ3 gene and GLEAN3_21738 likely has the rest. A couple of exons are likely missing: one at the beginning and one between the two GLEANs.
###Gene_Info_Comments GLEAN3_21738 ###
GLEAN3_26862 has the first part of the FTSJ3 gene and GLEAN3_21738 likely has the rest. A couple of exons are likely missing: one at the beginning and one between the two GLEANs.
###Gene_Info_Comments GLEAN3_17569 ###
Likely missing last exon.
###Gene_Info_Comments GLEAN3_20191 ###
GLEAN3_20191 is a partial duplicate prediction for GLEAN3_01541.
###Gene_Info_Comments GLEAN3_01541 ###
GLEAN3_20191 is a partial duplicate prediction for GLEAN3_01541.
###Gene_Info_Comments GLEAN3_23204 ###
GLEAN3_07955 is a duplicate prediction for GLEAN3_23204.
###Gene_Info_Comments GLEAN3_07955 ###
GLEAN3_07955 is a duplicate prediction for GLEAN3_23204.
###Gene_Info_Comments GLEAN3_11958 ###
GLEAN3_11958 has most of the gene except the last exon or two, which are encoded by GLEAN3_08969. There is a significant overlap between the two GLEANs.
###Gene_Info_Comments GLEAN3_08969 ###
GLEAN3_11958 has most of the gene except the last exon or two, which are encoded by GLEAN3_08969. There is a significant overlap between the two GLEANs.
###Gene_Info_Comments GLEAN3_04785 ###
GLEAN3_06645 and GLEAN3_10748 are partial duplicate predictions for GLEAN3_04785.
###Gene_Info_Comments GLEAN3_06645 ###
GLEAN3_06645 and GLEAN3_10748 are partial duplicate predictions for GLEAN3_04785.
###Gene_Info_Comments GLEAN3_10748 ###
GLEAN3_06645 and GLEAN3_10748 are partial duplicate predictions for GLEAN3_04785.
###Gene_Info_Comments GLEAN3_11454 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR.  
This is a member of sea urchin-specific Tlr Group IA.
###Gene_Info_Comments GLEAN3_12464 ###
The intron of this gene model included a long unknown sequence, so the model was modified to an intronless gene by comparison with the Fgenesh++ and NCBI predictions.
This is a member of sea urchin-specific Tlr Group IB. 
###Gene_Info_Comments GLEAN3_27226 ###
This model is incorrect. It encodes the MTO1 gene in the first part, but the latter half appears to be similar to GLIPR1L1 (NP_689992).
###Gene_Info_Comments GLEAN3_13160 ###
This GLEAN contains the N-terminal region of the full-length Mdn1 gene, corresponding to amino acids 1 to ~1532 of human midasin. The mid-region of the Sp-MDN1 gene is in GLEAN3_09702 and the C-terminal region is in GLEAN3_22614. The MDN1 gene contains a conserved MIDAS domain (COG5271.2) and a hexameric AAA ATPase domain distantly related to that of dynein.
###Gene_Info_Comments GLEAN3_09702 ###
This GLEAN contains the mid-region of the full-length midasin polypeptide, corresponding to approximately amino acids 2660-3943 of human midasin. The N-terminal region is in GLEAN3_13160 and the C-terminal region is in GLEAN3_22614.
###Gene_Info_Comments GLEAN3_22908 ###
Likely missing an exon.
###Gene_Info_Comments GLEAN3_23058 ###
Likely missing an exon (~30 AA).
###Gene_Info_Comments GLEAN3_02179 ###
Missing ~100 AA.
###Gene_Info_Comments GLEAN3_24540 ###
GLEAN3_13159 is a partial duplicate prediction for GLEAN3_24540.
###Gene_Info_Comments GLEAN3_13159 ###
GLEAN3_13159 is a partial duplicate prediction for GLEAN3_24540.
###Gene_Info_Comments GLEAN3_17807 ###
GLEAN3_17807 is a partial duplicate prediction for GLEAN3_18692.
###Gene_Info_Comments GLEAN3_28576 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR.
This is a member of sea urchin-specific Tlr Group I(orphan).
###Gene_Info_Comments GLEAN3_07429 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR. 
This is a member of sea urchin-specific Tlr Group IC. 
###Gene_Info_Comments GLEAN3_11570 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR. This is a member of sea urchin-specific Tlr Group IC. 
###Gene_Info_Comments GLEAN3_26347 ###
This GLEAN forms part of the annotated full-length Sp-DNAH1 gene (SPU_030223)
###Gene_Info_Comments GLEAN3_11876 ###
This GLEAN forms part of the annotated full-length Sp-DNAH10 gene (SPU_030231)
###Gene_Info_Comments GLEAN3_15054 ###
This GLEAN forms part of the annotated full-length Sp-DNAH10 gene (SPU_030231)
###Gene_Info_Comments GLEAN3_28362 ###
This GLEAN forms part of the annotated full-length Sp-DNAH10 gene (SPU_030231)
###Gene_Info_Comments GLEAN3_26539 ###
This GLEAN forms part of the annotated full-length Sp-DNAH10 gene (SPU_030231)
###Gene_Info_Comments GLEAN3_28189 ###
This GLEAN forms part of the annotated full-length Sp-DNAH10 gene (SPU_030231)
###Gene_Info_Comments GLEAN3_02747 ###
This GLEAN forms part of the annotated full-length Sp-DNAH12 gene (SPU_030232)
###Gene_Info_Comments GLEAN3_05317 ###
This GLEAN forms part of the annotated full-length Sp-DNAH12 gene (SPU_030232)
###Gene_Info_Comments GLEAN3_20747 ###
This GLEAN forms part of the annotated full-length Sp-DNAH12 gene (SPU_030232)
###Gene_Info_Comments GLEAN3_02750 ###
This GLEAN forms part of the annotated full-length Sp-DNAH14 gene (SPU_030233)
###Gene_Info_Comments GLEAN3_28434 ###
This GLEAN forms part of the annotated full-length Sp-DNAH14 gene (SPU_030233)
###Gene_Info_Comments GLEAN3_08188 ###
This GLEAN forms part of the annotated full-length Sp-DNAH14 gene (SPU_030233)
###Gene_Info_Comments GLEAN3_12139 ###
This GLEAN forms part of the annotated full-length Sp-DNAH14 gene (SPU_030233)
###Gene_Info_Comments GLEAN3_16536 ###
Intronless Toll-like receptor with predicted secretory signal peptide, LRR-NT, LRR(22), LRR-CT, TM and TIR.  
This is a member of sea urchin-specific Tlr Group ID.
###Gene_Info_Comments GLEAN3_26081 ###
This GLEAN is part of the annotated full-length Sp-DNAH15 gene (SPU_030234)
###Gene_Info_Comments GLEAN3_17935 ###
This GLEAN is part of the annotated full-length Sp-DNAH15 gene (SPU_030234)
###Gene_Info_Comments GLEAN3_12417 ###
This GLEAN is part of the annotated full-length Sp-DNAH2 gene (SPU_030224)
###Gene_Info_Comments GLEAN3_18822 ###
This GLEAN is part of the annotated full-length Sp-DNAH2 gene (SPU_030224)
###Gene_Info_Comments GLEAN3_10136 ###
This GLEAN is part of the annotated full-length Sp-DNAH2 gene (SPU_030224)
###Gene_Info_Comments GLEAN3_18784 ###
This GLEAN is part of the annotated full-length Sp-DNAH2 gene (SPU_030224)
###Gene_Info_Comments GLEAN3_27159 ###
This GLEAN forms part of the annotated full-length Sp-DNAH3 gene (SPU_030236)
###Gene_Info_Comments GLEAN3_26238 ###
GLEAN3_12397 is a partial duplicate prediction for GLEAN3_26238.
###Gene_Info_Comments GLEAN3_12397 ###
GLEAN3_12397 is a partial duplicate prediction for GLEAN3_26238.
###Gene_Info_Comments GLEAN3_04548 ###
Missing an exon in middle.
###Gene_Info_Comments GLEAN3_05987 ###
GLEAN3_05987 is a composite prediction of two genes: the first part codes for Sp-NUC205 and the last ~350 AA code for Sp-PAQR5.
GLEAN3_15468 and GLEAN3_21249 are partial duplicate predictions for GLEAN3_05987.
###Gene_Info_Comments GLEAN3_15468 ###
GLEAN3_15468 and GLEAN3_21249 are partial duplicate predictions for GLEAN3_05987.
###Gene_Info_Comments GLEAN3_21249 ###
GLEAN3_15468 and GLEAN3_21249 are partial duplicate predictions for GLEAN3_05987.
###Gene_Info_Comments GLEAN3_10384 ###
Missing first exon.
###Gene_Info_Comments GLEAN3_12748 ###
GLEAN3_12748 is a partial duplicate prediction for GLEAN3_18535.
###Gene_Info_Comments GLEAN3_27020 ###
Incorrect gene model. Extra exons.
###Gene_Info_Comments GLEAN3_15720 ###
Likely has an extra exon.
GLEAN3_02618 and GLEAN3_22235 are partial duplicate predictions for GLEAN3_15720.
###Gene_Info_Comments GLEAN3_22235 ###
GLEAN3_02618 and GLEAN3_22235 are partial duplicate predictions for GLEAN3_15720.
###Gene_Info_Comments GLEAN3_02618 ###
GLEAN3_02618 and GLEAN3_22235 are partial duplicate predictions for GLEAN3_15720.
###Gene_Info_Comments GLEAN3_24386 ###
This gene model may represent a pseudogene or contain a sequence error. Intron sequence matches coding sequence of other Sp-Tlr genes and contains stop codons. Modified gene model has some frame shifts, but reflects best gene structure.
This is a member of sea urchin-specific Tlr Group IE.

###Gene_Info_Comments GLEAN3_17828 ###
A number of other GLEANs may be partially similar to Sacsin. These may include GLEAN3_08522, 15037, 21335, 28108, 25360, 04170 and 00614.
GLEAN3_17828, _08522 and _25360 are very similar and form one group.
GLEAN3_04170, _28108, _15037, _21335, _00614, _06655, _16709 and _22759 form a second group of sacsin-like genes.
###Gene_Info_Comments GLEAN3_25360 ###
A number of other GLEANs may be partially similar to Sacsin. These may include GLEAN3_08522, 15037, 21335, 28108, 25360, 04170 and 00614.
GLEAN3_17828, _08522 and _25360 are very similar and form one group.
GLEAN3_04170, _28108, _15037, _21335, _00614, _06655, _16709 and _22759 form a second group of sacsin-like genes.
###Gene_Info_Comments GLEAN3_08522 ###
A number of other GLEANs may be partially similar to Sacsin. These may include GLEAN3_08522, 15037, 21335, 28108, 25360, 04170 and 00614.
GLEAN3_17828, _08522 and _25360 are very similar and form one group.
GLEAN3_04170, _28108, _15037, _21335, _00614, _06655, _16709 and _22759 form a second group of sacsin-like genes.
###Gene_Info_Comments GLEAN3_00614 ###
A number of other GLEANs may be partially similar to Sacsin. These may include GLEAN3_08522, 15037, 21335, 28108, 25360, 04170 and 00614.
GLEAN3_17828, _08522 and _25360 are very similar and form one group.
GLEAN3_04170, _28108, _15037, _21335, _00614, _06655, _16709 and _22759 form a second group of sacsin-like genes.
###Gene_Info_Comments GLEAN3_04170 ###
A number of other GLEANs may be partially similar to Sacsin. These may include GLEAN3_08522, 15037, 21335, 28108, 25360, 04170 and 00614.
GLEAN3_17828, _08522 and _25360 are very similar and form one group.
GLEAN3_04170, _28108, _15037, _21335, _00614, _06655, _16709 and _22759 form a second group of sacsin-like genes.
###Gene_Info_Comments GLEAN3_28108 ###
A number of other GLEANs may be partially similar to Sacsin. These may include GLEAN3_08522, 15037, 21335, 28108, 25360, 04170 and 00614.
GLEAN3_17828, _08522 and _25360 are very similar and form one group.
GLEAN3_04170, _28108, _15037, _21335, _00614, _06655, _16709 and _22759 form a second group of sacsin-like genes.
###Gene_Info_Comments GLEAN3_15037 ###
A number of other GLEANs may be partially similar to Sacsin. These may include GLEAN3_08522, 15037, 21335, 28108, 25360, 04170 and 00614.
GLEAN3_17828, _08522 and _25360 are very similar and form one group.
GLEAN3_04170, _28108, _15037, _21335, _00614, _06655, _16709 and _22759 form a second group of sacsin-like genes.
###Gene_Info_Comments GLEAN3_21335 ###
A number of other GLEANs may be partially similar to Sacsin. These may include GLEAN3_08522, 15037, 21335, 28108, 25360, 04170 and 00614.
GLEAN3_17828, _08522 and _25360 are very similar and form one group.
GLEAN3_04170, _28108, _15037, _21335, _00614, _06655, _16709 and _22759 form a second group of sacsin-like genes.
###Gene_Info_Comments GLEAN3_06655 ###
A number of other GLEANs may be partially similar to Sacsin. These may include GLEAN3_08522, 15037, 21335, 28108, 25360, 04170 and 00614.
GLEAN3_17828, _08522 and _25360 are very similar and form one group.
GLEAN3_04170, _28108, _15037, _21335, _00614, _06655, _16709 and _22759 form a second group of sacsin-like genes.
###Gene_Info_Comments GLEAN3_16709 ###
A number of other GLEANs may be partially similar to Sacsin. These may include GLEAN3_08522, 15037, 21335, 28108, 25360, 04170 and 00614.
GLEAN3_17828, _08522 and _25360 are very similar and form one group.
GLEAN3_04170, _28108, _15037, _21335, _00614, _06655, _16709 and _22759 form a second group of sacsin-like genes.
###Gene_Info_Comments GLEAN3_22759 ###
A number of other GLEANs may be partially similar to Sacsin. These may include GLEAN3_08522, 15037, 21335, 28108, 25360, 04170 and 00614.
GLEAN3_17828, _08522 and _25360 are very similar and form one group.
GLEAN3_04170, _28108, _15037, _21335, _00614, _06655, _16709 and _22759 form a second group of sacsin-like genes.
###Gene_Info_Comments GLEAN3_15550 ###
Likely incorrect. Longer than required.
###Gene_Info_Comments GLEAN3_15087 ###
GLEAN3_15087 has the second half of TSR1 gene. GLEAN3_19664 likely codes for the first half.
###Gene_Info_Comments GLEAN3_19664 ###
GLEAN3_15087 has the second half of TSR1 gene. GLEAN3_19664 likely codes for the first half.
###Gene_Info_Comments GLEAN3_27482 ###
Longer than required prediction.
###Gene_Info_Comments GLEAN3_05961 ###
Missing one or more exons at beginning (~80 AA).
###Gene_Info_Comments GLEAN3_05268 ###
GLEAN3_20117 is a duplicate prediction for GLEAN3_05268.
###Gene_Info_Comments GLEAN3_20117 ###
GLEAN3_20117 is a duplicate prediction for GLEAN3_05268.
###Gene_Info_Comments GLEAN3_22232 ###
GLEAN3_22232 is a partial duplicate prediction for GLEAN3_13282.
###Gene_Info_Comments GLEAN3_02056 ###
GLEAN3_20938 is a partial duplicate prediction for GLEAN3_02056.
###Gene_Info_Comments GLEAN3_20938 ###
GLEAN3_20938 is a partial duplicate prediction for GLEAN3_02056.
###Gene_Info_Comments GLEAN3_18123 ###
Incorrect gene model. Has an extra exon (or more) in the middle and is missing the last exon (or more).
###Gene_Info_Comments GLEAN3_23316 ###
Incorrect gene model. This appears to be a conserved gene across species.
###Gene_Info_Comments GLEAN3_26256 ###
GLEAN3_26256 is a longer duplicate prediction for GLEAN3_23423.
###Gene_Info_Comments GLEAN3_14179 ###
Likely missing an exon.
###Gene_Info_Comments GLEAN3_14882 ###
Incorrect gene model. Extra exon in middle?
###Gene_Info_Comments GLEAN3_22950 ###
Only second half of the WAPAL gene is encoded by GLEAN3_22950. First part is missing.

###Gene_Info_Comments GLEAN3_21483 ###
Incorrect gene model. Likely has an extra exon predicted.
###Gene_Info_Comments GLEAN3_01415 ###
Inspection of the tiling array suggests that glean may have missed the following exons: KSKRAPREEDTAPKRRREEAAGSSKQSPTKKKISSGRQAAGSGGGTPTQDELAPDPRESAKPAAQKRAEGPIKSDQTVRVEEKQESDSESSGRSSSGKGAKLASLPELME
###Gene_Info_Comments GLEAN3_02008 ###
Inspection of the tiling array suggests that glean may have missed the following exons: DSYRREILLLYSLWQGLPNETSSYQPRPCAHGRETLRVRDMPQSFHRAGYSPKAQDHPFWPEALQVRDLRPSLCRQKRPQLSC,CELCPRKFVRKNFLNAHMKLHQGIKPKKPPERSFTCTICNKVLKTRASYQTHNRIHTGEKSFCCTLCGKAFPTKPRLINHVRVHTGEKPYECETCHKAFTEPGTLRRHKIIHSGLKPYKCETCDRAFADKSALNSHVKMHTGQKSHSCEFCGKMFWTATNMRQHAKTHRKKSMFECGVCSKEIFGQENLTAHLVEHEAEQR,LSNVPSAISGFRQKGQRTQHQKKVHKVKMEEGNEAEGEVVSSEDGPVIIPNVKRVFKCRVCQVEFEAKEELKEHKLTHKELENGDDEYVPISVKVSRPKKVVETFKCDICNNSFAQKAYLERHRRVHTGEKPFGCTLCEKKFSDMTSLRRHKSIHTGAKPF,QCEVCEKFFKTKKTLQKHGAIHDEEKRYECDVCQKRFSRKAYLVSHSTIHTGEKPYTCEDCGRQFRDRSSMKRHMNTHKGIKRYECNVCQKQFTDKSAANIHLRIHTGEKPYECYECK
###Gene_Info_Comments GLEAN3_02009 ###
Inspection of the tiling array suggests that glean may have missed the following exons: DGQRLKHMRQKHSLAKCDECGACFEDEALLQRHMKMHSQVTMFMCDVCGSTFTKKSYLTFHMVVHEKEDLSERMSMKEVDEGKVPALPSKGQKQLQIVDDDEDDDDVGDGGNDPDDSDWEEPLAAKKKGSKFDCDRCSRSFASLRGLKMHQRMLHITVEEEPQSESSEEEEEDAKTEEDAMIEEKGGNDKSKHCPVCKKSFVSVRQLTRHESTHASWDCTYCSKTFRTSWILKEHLNTHTGQRPYQCTECDKTFKSHGALRRHTIIHKGTKPYRCDLCDMRFSDGSSLKSHKKRRSCR,RATRNVDHAGDKQVECDVCLKKFYTGFQMRTHRLTHGGQDHKEENLLRCESCSKAFLSPSGLEKHKKSGKCGKKFTCPFCTDSFIYKYEREKHMETHAEFVNKADKEDEVTDGVKKSKVFKCPICEQKFPNLRTFTIHRKKHERGKVYACEVCNKVFSTPISLKYHRKLHTGQGPKCSVCDKTFYNLKSLRRHERIHTGQKPYNCGFC,SLSYLSSTHTESFVQIVGRHSSTFPDSSLPQAESTNLNLETMGEDAQPTNQDVDTVMLETELTNESADTMTPEAELTDENADPVILVSEMTNQSADTMTPEAGLTNENADPVILVSEMTNQSADTMTPEAGLTNENADPVILASEMTNQSVDTMIPEDQPQTNAEPRSSSLEVQGKL,LSNVPSAISGFRQKGQRTQHQKKVHKVKMEEGNEAEGEVVSSEDGPVIIPNVKRVFKCRVCQVEFEAKEELKEHKLTHKELENGDDEYVPISVKVSRPKKVVETFKCDICNNSFAQKAYLERHRRVHTGEKPFGCTLCEKKFSDMTSLRRHKSIHTGAKPF
###Gene_Info_Comments GLEAN3_07169 ###
Inspection of the tiling array suggests that glean may have missed the following exons: TRYKRICDRHICDDPVLAMLNLPLPNINSKSITNFISKQEKRSISLQPLNPKSSNIAKAFLLIPFRHLSSFQRCSRTQGMHRLSTVSYLMNLASAGVRRGTHSSMNQGLFHLQGMSQHLAHLQCLYLLLVTNQNLSYPLLPPR
###Gene_Info_Comments GLEAN3_11001 ###
Inspection of the tiling array suggests that glean may have missed the following exons: FYVNFLLLFFRFSLVSLYIWRYHHGLPVKRCGKLATYHPVNRSATVISYLKNSQPCQPLDMASTSNPSLLPQKNIKQEIIVVFEPPAREMASSSSHSSVPQPETSQEINVVPVSPVKDMPSTSNPSSGHETGQENDMVPDSTGIELSKDDLKAGGGKVGSSSKKKSGCEKDSDEYKRRRERNNEAVRKSRQKSRQKASETEVRVTELKKENADLEQRVTLLHKELELLKDLFLTHANELPDPSTTFGLFNANPRLGSSSPNPALSRRIVLKTESLTVSLTCRNVPESITTTT
###Gene_Info_Comments GLEAN3_11380 ###
Inspection of the tiling array suggests that glean may have missed the following exons: FLDEGQVLLRLLLFVLDNPLIKVLIKQTFELLIYVCLGLYSVRMNCMLPGLILSCSIFYPAERAVRLFLTSTVSWPLEYYFLLTDQHAYLHLRSVAGCLEGMAISRGLLLLVCHLLTLVFALEHILQEGLCCSVHENQ
###Gene_Info_Comments GLEAN3_14683 ###
Inspection of the tiling array suggests that glean may have missed the following exons: SPVISDAQKFQCSHCEGLFSSAKLILRHIRCEHSDGEPCEMMPALAWKRKGKKKGREKSVAIKFKFNHPINHPIVRKRKNSEEEEECDFRCGTCVKSFPSLGRLKEHELFHEMMHGDKPYECSECNQRYTAQSSLNRHEREVHGFLDDYKPRSRPKRLKAHVPKKPLHCRYCGQGYKSRGALANHERRIHGSRHPIREPDLPNDEPKDMLGRPSDYYQRPFKCRFCPKRYVSWTTVEQHEKEVHTREGTFKCSHCPKVCASESRLKEHLVVHKYMHMHRCTLCPRSFASESALNNHQGEHTGLKPFKCEICSRGFRTRKLTLKHKQRMHQERPKRYICSICNKGFAEKCNLKVHERRHKGIRQFVCLECGKGFTARFSLTAHMQAMHIKERPFACEICGKSFALNHHYNHHMAKHRLDGDDSIPQ,RRMYRKSHFTVVTVAKGTNHAVHSRTTRGESMALGTRFGNRTYRTMSPRICLVDPLITTSDPSSADFVQRDTFPGQRLNNTRRRSTREKALSSAVIVPRFAPVRAV,SVKEIQTIKQREQCSSSSHQASASSSSSDTSNPTPNTSKDESQLLAALNLKKTKSIQDLPQNLLFRATPEGKVDGVVAKERIEKGVEFGPYAGTLLDEEQGWTRDTTWEVRRAVFHKTVF,FPLDSAHGVSNAGIIHQARQQLPVHLRGHGVKSSTLSANYAPPITTHEPIRERNDLPITTHESVSSIIQPLTTPESGAKSNVPRPQGTVCNFCLVGFC,TRCPGVRDRTLVVTQRSVNKLFVARNCHSFTLKKPRQFILWQFTAPYLNKPLLSLSLSLSPSLYATSLNLENELSSASTDSNLTLYH
###Gene_Info_Comments GLEAN3_14686 ###
Inspection of the tiling array suggests that glean may have missed the following exons: FSFNVSDAQKFQCSLCEDLFSSSKLILRHIRLEHTDGKPHDKLPLVTPKKREKKHVRLKISSKHLFKTKKKAREKEESDLKCATCGKVFLSSGRLKAHEIFHDYNQDHTCPICGKRQKNAQTWAKHMNLHKPASEARPHKCNECNGRYKSKAALRKHQHQVHGYPCRLCSERFSRMKDCKTHEQTHQAFIPPHAAVEYLVAELSPDEPKDMLGKRANYYQRRFKCRYCPKRYSDHYTVRNHEKENHTGEGTFKCSHCPKAYTSESRLKTHLLFHEQTHIYRCMLCPSSFASESALNSHQGEHTGLKPVKCDVCGKGFRTRKHSLAHRRRVHQERPKRFFCSYCNFGFAEKGDFNKHEQRHKGIRRYMCRECGMPFTSNTSLTAHIRALHTKERPFSCEICGKTFALNFKYTLHMVRHNVQVNGSSLQQQ,QNSLLMSRKTCLVNVPITTNEGSSAVIVQRDIVTITLSGTMRKRTIPVKEPSNAVIAQKLIPAKAVLKLTSCSMSRRTYTAACFVQAALHRKAPSTVTKENIPD,SQSNVMSVAKDLELGSIPLRIGVVFIRNVQSVSSAPTAILGLQKRAILTNMSSGTRALDDTCVGSVECPSPQIHLSQLTFEHCIPRKDHFHVKYAGKLLP,LGQSSIKSAPSLQSTDHRKQCLPSSKPFHQANASPPAPDVSAAEPPLFAALNLTKTISVMDLPLSLSLRVASQDKVEGVVAKDTVEKGVEFGPYTGTLLDEEQGSSKETTWEV,VSHLLNQLLHYKAPTIESSVYLRQSRFIKLMQVLRPQTSLLPNLLFLQRSISQRQSPSWIFHCLSHYELHRRTRSKEWLPRIQLRRGWSLDPTQEHCWMRSRDRLRRQPGRY,ANIIHQTPPQPPVTLPVHIRGHGVKSSTLSANYAPPINTTHGAVEEERNDLQIATHGSVEASKMLPLAFHKSVVERNFQPTTSYESGLHVESNALPIITCKSA
###Gene_Info_Comments GLEAN3_14687 ###
Inspection of the tiling array suggests that glean may have missed the following exons: QCRFPILKPLYHEIQIAVWMPQGAVQGMVSEADPLRTQADAVRLCGMVGPWGRVQHVAGVVAVVVAKVVDVVGFPVDLFQCSRNQVDNYLPLLIYYEGIKEN,VLNVAVHTLVKFEILESCMNSNILPVYLAVEHFQFTSWFNDLSASEVDIAVILICFHVLFRFTPCASQMRHPVFTSVNVEENPTWITDL
###Gene_Info_Comments GLEAN3_17682 ###
Inspection of the tiling array suggests that glean may have missed the following exons: LSCLVFFKLKDDPPDMPDMGFPLSSHDSADQSPLDTALSVSAMLVETGSDHNSDSDFMTGDGVIPGDMVSGFRESTINLQDLE,PPPLPPPPPKTLRRMPSHRLQLHPSPAHPSSRLSPCSLPAPGDYPYPLPWPRPSSPSSHRIHPPTPPHPPSSSPHLHWYHRHRR,DGCHPTGYSCTRLQRIPPLVFLRAHCQLRGTIHTPFLGHARPLHPPTEFILQPLRTLHHPPLIFTGIIVIVGEGVATSICSPYHHHHGHALDHAQGVPGEQEWLGKDRLSGGNCHTSAASRCSGALEEDR,INLTTTTTTTTTKDSETDAIPPVTAAPVSSASLLSSFSVLTASSGGLSIPPSLATPVLSILPPNSSSNPSAPSIILPSSSLVSSSSSVKESLPVSAPPTTTTTVTPSTTPKVYLENRSGSARTDYRAVTATHPLPLDALERLKKTGNY
###Gene_Info_Comments GLEAN3_17847 ###
Inspection of the tiling array suggests that glean may have missed the following exons: EWTGGRHYVMIQDWTDLEGGHHVVTAHRDENPNRPKETQRNRHVKRPRSLHSMMDRKLPLDIADNLRPNQKKSWITTKDPLHHLTCDIHLGKWRRRLHGLVSGLKLLVRMLKLKLTL,PPGHEIVLADRVTGKRVALCEGLVESPVNPTELWKKGPIVARSVVKADEDVLPVRVFNPTREQGIIKAGTAIASLSALEEMGTEMCKQTMPTETSASKNMTSDKQRNRDGDARRATTHFFQCDSCTKSYEKKDSLQRHKREVHIRKYRCGQCDYRTGRKTEIERHQGAHAVAVVPVRKHSLTPVSSPRCKPTSTPITGTSEKADASWRHVRMDWRSPLRDDPRLDRPRGRSPRRYRSSRRKSKSPQRDPTKSSCQKTPESPQHDGSEASPRHSRQSTPEPEEELDNDERSPSPLDLRYSSRKMASETAWASEWTEIASKNAETQTDT,ESPVNPIELWKKGPIVARSVVKTDEDVLPVRVFNPTREQRNIKAGTAIASLSAVEEVGTEMCNQTMPTETSASRNMPPDKQGNKDVDARRTTTHFFQCDSCTKSYEKKDSLQRHKREVHILKYRCDQCDYRTGRKTEMERHQGAHEVAVVPVRKHSLTPVSSPRCKPTSTPITGTSEKADASWRHVRMDWRSPPRDDPRLDRPRGRSPRRYRSSRRKSKSPQRDLTKSSCQKTPESPQHDGSEAS,WCQKWDPVTMAQGRNRQNRPTPGEMMKHLERSMTEVDQMMEQLRASMSQVPVADPEPGRQEAFPPTLSGQTASGPTADSQMPAAQRVRFSSTPREQSRRLPSAGTGVHITPPLFSPP
###Gene_Info_Comments GLEAN3_17848 ###
Inspection of the tiling array suggests that glean may have missed the following exons: EWTGGRHYVMIQDWTDLEGGHHVVTAHRDENPNRPKETQRNRHVKRPRSLHSMMDRKLPLDIADNLRPNQKKSWITTKDPLHHLTCDIHLGKWRRRLHGLVSGLKLLVRMLKLKLTL,PPGHEIVLADRVTGKRVALCEGLVESPVNPTELWKKGPIVARSVVKADEDVLPVRVFNPTREQGIIKAGTAIASLSALEEMGTEMCKQTMPTETSASKNMTSDKQRNRDGDARRATTHFFQCDSCTKSYEKKDSLQRHKREVHIRKYRCGQCDYRTGRKTEIERHQGAHAVAVVPVRKHSLTPVSSPRCKPTSTPITGTSEKADASWRHVRMDWRSPLRDDPRLDRPRGRSPRRYRSSRRKSKSPQRDPTKSSCQKTPESPQHDGSEASPRHSRQSTPEPEEELDNDERSPSPLDLRYSSRKMASETAWASEWTEIASKNAETQTDT,GKGEVYLAELRQRCRQPKESLQELGQTIRELCTLSYPEFDEKGQDRLARGHFLDAVVTPEIREGLFRAQPRTLDDAVEAALNTEAFLRMEGQRNEVKRSTTYSRALEECEVSAIREQQPRNPTIDEIVKKVLDALDMRNGRNTIKPDVPDRRPEQTMPTKVSEREDNRCFNCNELGHWRNQCPYPRKVRGGTAPPAAEKANTNLQWATANGMTDEEEQARVGSSQDPNRKGLFLE,RCVNCLSGCSIQQGNMGSQRQALPLPHCQRWKKWAQKCAIRRCPLKPVQAETCHPTSRETRTLMPEVRLPTSFNVTLVPKVMRRKTL,DFLVQINAVLDCQKMELRTEWGIIPCLDSEGESFCRRIVAGEEYSIPPGHEMVLANRVTGEKIALCEGLVESPVNPTELWKKGPIVARSVVKADEDVLIACQGVQFNKGTWDHKGRHCHCLIVSAGRSGHRNVQSDDAH,LPVRVFNSTREHGITKAGTAIASLSALEEVGTEMCNQTMPIETSASRNMPPDKQGDKDVDARGTTTHFFQCDSCTKSYEKKDSL,LPVRVFNSTREQGITKAGTAIASLSALEEVGTEMCNQTMPIETSASRNMPPDKQEDKDVDARRTTTHFFQCDSCTKSYEKKDSLQRHKREVHILKYRCDQCDYRTGRKTEMERHQGAHDVAVVP,CLSGCSIQQGNRGSQRQALPLPHCQRWKKWAQKCAIRRCPLKPVQAETCHPTSRKTRTLMPDVRLPTSFNVTLVPKVMRRKTLCKDTSGKCIS,WCQKWDPVTMAQGRNRQNRPTPGEMMKHLERSMTEVDQMMEQLRASMSQVPVADPEPGRQEAFPPTLSGQTASGPTADSQMPAAQRVRFSSTPREQSRRLPSAGTGVHITPPLFSPP,SPRRYRSSRRKSKSLQRDLTKSSCQKTPESPQHDGSEAPPRHSRQSTPEPGEELDNDERSPSPLDLRYSSRKMASETAWASEWTEIASKNAETQTDTDSSR,RNRHVKRPRSLHNMMDRKLPLDIADNLRPNQEKSWITTKGPLHHLTCDIHLGKWRRRLPGLVSGLKLLVRMLKLKLTLIVLVK
###Gene_Info_Comments GLEAN3_17849 ###
Inspection of the tiling array suggests that glean may have missed the following exons: EWTGGRHYVMIQDWTDLEGGHHVVTAHRDENPNRPKETQRNRHVKRPRSLHSMMDRKLPLDIADNLRPNQKKSWITTKDPLHHLTCDIHLGKWRRRLHGLVSGLKLLVRMLKLKLTL,PPGHEIVLADRVTGKRVALCEGLVESPVNPTELWKKGPIVARSVVKADEDVLPVRVFNPTREQGIIKAGTAIASLSALEEMGTEMCKQTMPTETSASKNMTSDKQRNRDGDARRATTHFFQCDSCTKSYEKKDSLQRHKREVHIRKYRCGQCDYRTGRKTEIERHQGAHAVAVVPVRKHSLTPVSSPRCKPTSTPITGTSEKADASWRHVRMDWRSPLRDDPRLDRPRGRSPRRYRSSRRKSKSPQRDPTKSSCQKTPESPQHDGSEASPRHSRQSTPEPEEELDNDERSPSPLDLRYSSRKMASETAWASEWTEIASKNAETQTDT,ESPVNPIELWKKGPIVARSVVKTDEDVLPVRVFNPTREQRNIKAGTAIASLSAVEEVGTEMCNQTMPTETSASRNMPPDKQGNKDVDARRTTTHFFQCDSCTKSYEKKDSLQRHKREVHILKYRCDQCDYRTGRKTEMERHQGAHEVAVVPVRKHSLTPVSSPRCKPTSTPITGTSEKADASWRHVRMDWRSPPRDDPRLDRPRGRSPRRYRSSRRKSKSPQRDLTKSSCQKTPESPQHDGSEAS,GKGEVYLAELRQRCRQPKESLQELGQTIRELCTLSYPEFDEKGQDRLARGHFLDAVVTPEIREGLFRAQPRTLDDAVEAALNTEAFLRMEGQRNEVKRSTTYSRALEECEVSAIREQQPRNPTIDEIVKKVLDALDMRNGRNTIKPDVPDRRPEQTMPTKVSEREDNRCFNCNELGHWRNQCPYPRKVRGGTAPPAAEKANTNLQWATANGMTDEEEQARVGSSQDPNRKGLFLE,RCVNCLSGCSIQQGNMGSQRQALPLPHCQRWKKWAQKCAIRRCPLKPVQAETCHPTSRETRTLMPEVRLPTSFNVTLVPKVMRRKTL,DFLVQINAVLDCQKMELRTEWGIIPCLDSEGESFCRRIVAGEEYSIPPGHEMVLANRVTGEKIALCEGLVESPVNPTELWKKGPIVARSVVKADEDVLIACQGVQFNKGTWDHKGRHCHCLIVSAGRSGHRNVQSDDAH,LPVRVFNSTREHGITKAGTAIASLSALEEVGTEMCNQTMPIETSASRNMPPDKQGDKDVDARGTTTHFFQCDSCTKSYEKKDSL,LPVRVFNSTREQGITKAGTAIASLSALEEVGTEMCNQTMPIETSASRNMPPDKQEDKDVDARRTTTHFFQCDSCTKSYEKKDSLQRHKREVHILKYRCDQCDYRTGRKTEMERHQGAHDVAVVP,CLSGCSIQQGNRGSQRQALPLPHCQRWKKWAQKCAIRRCPLKPVQAETCHPTSRKTRTLMPDVRLPTSFNVTLVPKVMRRKTLCKDTSGKCIS,SPRRYRSSRRKSKSLQRDLTKSSCQKTPESPQHDGSEAPPRHSRQSTPEPGEELDNDERSPSPLDLRYSSRKMASETAWASEWTEIASKNAETQTDTDSSR,RNRHVKRPRSLHNMMDRKLPLDIADNLRPNQEKSWITTKGPLHHLTCDIHLGKWRRRLPGLVSGLKLLVRMLKLKLTLIVLVK
###Gene_Info_Comments GLEAN3_17850 ###
Inspection of the tiling array suggests that glean may have missed the following exons: ESPVNPIELWKKGPIVARSVVKTDEDVLPVRVFNPTREQRNIKAGTAIASLSAVEEVGTEMCNQTMPTETSASRNMPPDKQGNKDVDARRTTTHFFQCDSCTKSYEKKDSLQRHKREVHILKYRCDQCDYRTGRKTEMERHQGAHEVAVVPVRKHSLTPVSSPRCKPTSTPITGTSEKADASWRHVRMDWRSPPRDDPRLDRPRGRSPRRYRSSRRKSKSPQRDLTKSSCQKTPESPQHDGSEAS,WCQKWDPVTMAQGRNRQNRPTPGEMMKHLERSMTEVDQMMEQLRASMSQVPVADPEPGRQEAFPPTLSGQTASGPTADSQMPAAQRVRFSSTPREQSRRLPSAGTGVHITPPLFSPP
###Gene_Info_Comments GLEAN3_19384 ###
Inspection of the tiling array suggests that glean may have missed the following exons: TWIVTFGVRFHINADIVDIMKDEIAAAVVFLTRIVKRNTSLTAEQMSKFSEKLALTLIEKFRNHWYEDKPSKGQAYRCIRVSRNEPRDSVISKTAKDCGIHYNHLNLPAELCLWVDPLEVSCR,NQICCRFVNRHAPFSFICRFGERGTVCEVATFNQQTLTDNRPSTPISSSNNAFNDNNANQLSPTPSPPSSPPRQLVINDNLRPNSGRKMVNSSQYVFNRNATNTPRVIQRPQNKVWVRQPNPEQYRWVNKSIAGRA
###Gene_Info_Comments GLEAN3_21758 ###
Inspection of the tiling array suggests that glean may have missed the following exons: EGLTPEALAQAGLTQNYVNAFTQQTLESLANSQGDITAENQISLQTQQIQQQLEAITGQSASLFSHAVNIQPVQEELPPPPPQETNRLPTTNTSSVFATNEGEKKTYSCHFCEKTFKKSSHLKQHIRSHTGEKPCKCMQCGRSFVSASTLRNHMRTHSGIKSFKCNLCNTSFTTNGSLVRHMNIHTDQRPHTCQQCGEAFRKQLDLKRHLKDHATESDDGEQLEDGKRPRNVIRFNEEEAQQIAKKPLARNATTSERILVQMVNDKNRVSEVLTQEDVLKSRPNFPNRCIHCSKSFKKPCDLVRHVRTHTGEKPFKCTECERTFAVKSTLVCHVRTHKGGKQEVCHICKTT,QVVVQNSGIQNIIPSLQNQSFIPSTSGMTTQVVAPPNSVAPFLQGSDVQMTIHDTLAQNVSHGEALMTNKMYTIHTTDRGTHLVQTDTTPSQADTSNLGQSFQLSLSGDQLSLQPNILLQQPNIQNPVEYTPSSSIAHSQEQITISTNGALGDSESSGTVTVNVVDFANLASHENVQTTASSSAAVPLQEEEESGEEDEDDDDVDDEDEEELEEESEDELQYVTEGGNLPSTGPMIAGHHVMREKPAMSAKDAAAAIGSIYPCT,EKPYKCPHCDKAFNQNGALHVHLTKHTGTKPHSCEFCGQKFAQRGNLRAHIVRVHEINSELEQRFECPQCCCNFRKMSSLNAHISRFHNSEEASSPLLSLKAALSAIQEGADWLDSTVDTPALAGALEQLINMLEENAGEGDPLDRIQEIMSGNTINTDILQQALDNSGVTNNVVEGQTETPAPAAPVAPVAPDAPDPAPEIPTASTTAPTVSTETTVAPANNQLNQLFSFVAE,KCLICDCLFTTNGSLKRHMSTHTDVRPFMCPYCQKTFKTSVNCKKHMKMHKRELALQAKQQENIEQDPNHVQEIQSIAEQASNLAEHLAIEQASNSILALAQGQLTVGQDGLHGANALNETS
###Gene_Info_Comments GLEAN3_22024 ###
Inspection of the tiling array suggests that glean may have missed the following exons: YFSTMTIRHPLIHFAIQTYKYSSFKRQYWKTNGGEVCREDQRQRRQDNEGWPKRKRFISTSGSGVKRKLEGACQKEVYIHICQ,LCSQLSWRVKHLKHISVWTLYEGASIKDSDRRLRGAVDTHYIITGPEQTIRQSASKLLVSSAFKNELSNFILKEWGKEHYWNIYSGRTRFASYGGG
###Gene_Info_Comments GLEAN3_23140 ###
Inspection of the tiling array suggests that glean may have missed the following exons: RAFPPAGLMNPRSSDCVGVTRPEGHLFGVGFVGLGGSEVVSVAAKQIFLTCSCDSAKLIPGTALHNLQYGFRNVVPFSSVTKFW,DTMARVLLIDTVKLSRAPRVLLSSKARQVVMAKTQVHKKCETTDREVQSSGTDVPSPLIICRSCARKSTTHNTMMSLLLTQRRM
###Gene_Info_Comments GLEAN3_23510 ###
Inspection of the tiling array suggests that glean may have missed the following exons: GFCYEKSLLLHMKSHIGEMPHKGLVCKRGLSSNSFLLRHIRSHTGEKPYQCLVCGKSFAHNSTLKLHTLRTHPGELRASFLEGISGAPTRENPFHLFWGACLVPSGLCIRSGCRVT,SKLDISSCSLCSGGNSRMQSSHQSMSGDRSHGGERRPGLPEDFPLETGRGTRSNSSERESCPSEYDSCLIPSQVAYFHEKRAADRETADATTSYHKPGGMHKDRNADITMEYDEMINQSSQLRAAGSDQNEPSCSLTEEKLFLCC,THTGEKPYQCKFCDKKYSRSSSLGVHIRTHTGEKPYQCKHCDVSFSRVETLSRHIGTHTGKEPFECSFCKKTFSHNGHLSRHLKIHTGERPFKCSVCNKTFSERGYLKDHQVIH,EGSHFKSKIHISSHSLCSGGDSSLQSSYQSMRNTQEMSGDRSCDGARHLPEDFPLQDYLGTQSDSSERESCTSENDSYLIPSQVACFHEKQAADRETLDATTSNHKPEGIDID
###Gene_Info_Comments GLEAN3_24746 ###
Inspection of the tiling array suggests that glean may have missed the following exons: NLTEHMGTHIGASPHHCPLCKSTFTRLSSLKKHVRNHHGKCPFQCVFCNKTFNEKDDLLVHVKIHTKKDLYHCLLCNKPWPTIRGLSFHIQLTHKDKSLASVLPETKHSPQGISSQEIPKPTQENIIPVALFSTTNPKDHSVNP
###Gene_Info_Comments GLEAN3_24902 ###
Inspection of the tiling array suggests that glean may have missed the following exons: FFLSLFRFLHMKKSMCVDQSTQTMPDIYSCSHCGSSLLAKPMPPLARPQPSGPQQLPGLESSQARCINKKDNGDDYTYDDGIVTLDNEKELLGPDKWVSEAPDHPVAIFVKEEVMSDLEQSNSMHSDKEESFFDPRQDQEFRSDSKLEHSETGWYEEEEEEEEEEEEDEDEMEDDEEIDYEALERLDPTYDPFIGSRPHQLVSIVNF,GVSDDEFQDKQDRVKKCLTGNHSSVRSECPGVCPLCSTIWNSLADRTSHLQTHIPKDQGRQAFKEACEAIEKKIGKEYTQRMGWCEVCDKFSSSLHTHMSNV
###Gene_Info_Comments GLEAN3_25849 ###
Inspection of the tiling array suggests that glean may have missed the following exons: SFVFSVSIHFGDEQHIETEHGVAQPVQTKDTVPSRGKLVTPATVSQHQDSMEVQIDYDDDEDNSYAAWMTEPDDGNDSDTQPSDTAKAVESDCPSAIQQPREEHPCPDCHVILHSTWDLDLHKLHDCTESNNSIQYVYNCDQCRKSFHRWAFTVAHLRKVHNCNMSHSEIIKKLDELKSTKSKKSGNKDEKCLPSEKKAPKRPGLIEAKGQTTRKKVARRKKRDNGDMQEVKKQTKSDRRNAPRLSKRPCIRLNKDLIKIFESERDDENNLTTNNRSIKPERTAGVSNVEKAQQRHTSDEHQNLNFKTEANNSSGGELATGRLNDERQGQGSTAEQVEEKGEDDLILMKEVELVIKDGERETSATIAMPESEISKAELEEETGNTLISQRDEGQGQAAEGVEENQDASANPIMEKKRKIHTADLKVHESQEGKSEPRDMIEHKASHETAERETKEASNLDADDDENNFTTIQKSTENAIVCSLCDLQFETKHDRSKHMPSHKEHRLQYKCSTCGKTFNRKVIYRTHVETHQDKTKRQKYH,RRECLKPTMRKSQEEMVSLINLLPHHLHPQTQGLTTLTMTVQTSKEAMLLKLLRPVMGKWLETGASPKRGCAGCNTPQPVRILSLKLHCRRVKSEKGERKKKR,LTAHTDVYKYSCDVCGKKFKRTSLRNSHMKVHSNDPANKPFKCELCSKAFAAQGKLKVHMDWHYNIRSYTCDVCGKSFLTKGNLDKHQFLHKGTKPHECKICSRGFMDLPGLRKHLDLVHKITLKKVVTQRVLEANDAKESGGDGVPHKPAATPSSPSNSGSDDADNDSPDFKRGNVAQAPEASDGQVVRDRGVSKEGVCRVQHPPTCQDIVAKIALPSGKKRKRGEEKKEE,NEIFFILFFNLLCSIPLTLRFIVQSFPTGIIGFSFEVKVLMFIRSVPLLCFCYVGNPCSSLRFNNHVVRGELIFIISFTFENLYPIFIQTDTKSLGKQRCLSSVSFCLLFHFLCISYIYFLPPCHLLPHLPSCFSEAWTSLNYCLRWKAFFIRIAIF,RSRPGAFLVLGGFCLGAGIGLLGAGTGLCCTVLSASLCNNPPSDMSCICSVSVSAWTPTPSSTRSSVCDIPLSSSDWSTC,TLYNHAICADLSPMCCVRSHGNPGMGVPLSADELHSGSHFLPLLRCLMFGCHYHLHHQVLSSMQHKNYPRHHRNRSVPPWNLGVVKR
###Gene_Info_Comments GLEAN3_27708 ###
Inspection of the tiling array suggests that glean may have missed the following exons: SRCHVHASFVGRHVLSCSESLAAEVALVGPFTPVFWKVVLERILRREGAWALLALEGPFQRVFPPVHDKPGLFIECQWTELTFVSTTIRMHSHFVLVHLVVSGELLWAVLTFEDLWFLAGDKYSLSKHVLW,TWRSTIITLAVAWLTLWQSSISWVECGKCRLLSRHNTLVPMRDSKCESFDQQSGDISTRLKLVHDDTNRMSKLSNNSSPEWLRL
###Gene_Info_Comments GLEAN3_01634 ###
motor domain
###Gene_Info_Comments GLEAN3_07083 ###
Motor domain
###Gene_Info_Comments GLEAN3_03746 ###
motor domain 
###Gene_Info_Comments GLEAN3_01633 ###
motor domain
###Gene_Info_Comments GLEAN3_07595 ###
GLEAN3_16421 is a partial duplicate prediction for GLEAN3_07595.
###Gene_Info_Comments GLEAN3_16421 ###
GLEAN3_16421 is a partial duplicate prediction for GLEAN3_07595.
###Gene_Info_Comments GLEAN3_05795 ###
Missing first ~80-100 AA.
###Gene_Info_Comments GLEAN3_15843 ###
GLEAN3_02037 is a partial duplicate prediction for GLEAN3_15843.
###Gene_Info_Comments GLEAN3_02037 ###
GLEAN3_02037 is a partial duplicate prediction for GLEAN3_15843.
###Gene_Info_Comments GLEAN3_26535 ###
GLEAN3_26535 has part II of the gene. GLEAN3_19403 may code for part I. There is still ~100 AA missing between the two GLEANs.
###Gene_Info_Comments GLEAN3_19403 ###
GLEAN3_26535 has part II of the gene. GLEAN3_19403 may code for part I. There is still ~100 AA missing between the two GLEANs.
###Gene_Info_Comments GLEAN3_21620 ###
GLEAN3_05597 has most of the gene, but a small part is better coded by GLEAN3_21620.
###Gene_Info_Comments GLEAN3_05597 ###
GLEAN3_05597 has most of the gene, but a small part is better coded by GLEAN3_21620.
###Gene_Info_Comments GLEAN3_10009 ###
Incorrect gene model. May be a mix of two separate genes.
###Gene_Info_Comments GLEAN3_13688 ###
Incorrect gene prediction. GLEAN3_13687 is a partial duplicate prediction for UNC50 gene.
###Gene_Info_Comments GLEAN3_13687 ###
Incorrect gene prediction. GLEAN3_13687 is a partial duplicate prediction for UNC50 gene.
###Gene_Info_Comments GLEAN3_03464 ###
May have an extra exon at beginning.
###Gene_Info_Comments GLEAN3_01579 ###
GLEAN3_25102 is a partial duplicate prediction for GLEAN3_01579.
###Gene_Info_Comments GLEAN3_25102 ###
GLEAN3_25102 is a partial duplicate prediction for GLEAN3_01579.
###Gene_Info_Comments GLEAN3_10312 ###
Missing first ~20 AA.
###Gene_Info_Comments GLEAN3_16553 ###
GLEAN3_16553 and GLEAN3_18172 are overlapping but incomplete models for UTP14A.
###Gene_Info_Comments GLEAN3_18172 ###
GLEAN3_16553 and GLEAN3_18172 are overlapping but incomplete models for UTP14A.
###Gene_Info_Comments GLEAN3_14712 ###
GLEAN3_20985 is a partial duplicate prediction for GLEAN3_14712.
###Gene_Info_Comments GLEAN3_20985 ###
GLEAN3_20985 is a partial duplicate prediction for GLEAN3_14712.
###Gene_Info_Comments GLEAN3_28809 ###
Incorrect gene model. Extra exon(s).
###Gene_Info_Comments GLEAN3_08110 ###
e val = 0 against NP_036442.
Exact match to XP_795366: PREDICTED: similar to Chromosome-associated kinesin KIF4A(Chromokinesin) [Strongylocentrotus purpuratus].
3447 nts spread over 25 exons.
This same sequence is found on Scaffoldi3148 from sp_20060316_asm.
Annotated by RA Obar, RL Morris, AL Silverio, BJ Chick,  AM Musante, AS Shorette.
###Gene_Info_Comments GLEAN3_25767 ###
e val for Q02224=e-3e-38; CENPE_HUMAN [Homo sapiens].  
e val for AAI10990=2e-30; KIF19 protein [Homo sapiens].
Kinesin-7 family member.  
See also Glean3_17809 and Glean3_23126 which also hit Q02224.
CENPE_HUMAN data obtained from UniProtKB/Swiss-Prot entry Q02224.   
Annotation by RA Obar, RL Morris, SA Tower, and AP Rawson.  
###Gene_Info_Comments GLEAN3_06544 ###
Incorrect gene model. Excellent homology to xylulokinase only in the latter half of the gene model. Annotated as such.

###Gene_Info_Comments GLEAN3_19081 ###
GLEAN3_04122 is a partial duplicate prediction for GLEAN3_19081. GLEAN3_19081 is likely missing a few amino acids in middle.
###Gene_Info_Comments GLEAN3_04122 ###
GLEAN3_04122 is a partial duplicate prediction for GLEAN3_19081. GLEAN3_19081 is likely missing a few amino acids in middle.
###Gene_Info_Comments GLEAN3_04700 ###
GLEAN3_04741 is a partial duplicate prediction for GLEAN3_04700.
###Gene_Info_Comments GLEAN3_04741 ###
GLEAN3_04741 is a partial duplicate prediction for GLEAN3_04700.
###Gene_Info_Comments GLEAN3_06047 ###
GLEAN3_06047 is missing a few AA at beginning and end. GLEAN3_15592 is a partial duplicate prediction that goes to the end of the CCT8 protein. There is a significant overlap between the two predictions.
###Gene_Info_Comments GLEAN3_15592 ###
GLEAN3_06047 is missing a few AA at beginning and end. GLEAN3_15592 is a partial duplicate prediction that goes to the end of the CCT8 protein. There is a significant overlap between the two predictions.
###Gene_Info_Comments GLEAN3_09300 ###
Likely missing ~50 AA in middle (one or more exons).
###Gene_Info_Comments GLEAN3_22270 ###
GLEAN3_22269 is a partial duplicate prediction for GLEAN3_22270.
###Gene_Info_Comments GLEAN3_22269 ###
GLEAN3_22269 is a partial duplicate prediction for GLEAN3_22270.
###Gene_Info_Comments GLEAN3_24103 ###
Incorrect gene model. Longer than necessary.
###Gene_Info_Comments GLEAN3_28735 ###
Missing ~100 AA.
###Gene_Info_Comments GLEAN3_01512 ###
GLEAN3_17897 is a partial duplicate prediction for GLEAN3_01512.
###Gene_Info_Comments GLEAN3_17897 ###
GLEAN3_17897 is a partial duplicate prediction for GLEAN3_01512.
###Gene_Info_Comments GLEAN3_13005 ###
GLEAN3_02924 is a partial duplicate prediction for GLEAN3_13005.
###Gene_Info_Comments GLEAN3_02924 ###
GLEAN3_02924 is a partial duplicate prediction for GLEAN3_13005.
###Gene_Info_Comments GLEAN3_17669 ###
GLEAN3_25669 is a partial duplicate prediction for GLEAN3_17669.
###Gene_Info_Comments GLEAN3_25669 ###
GLEAN3_25669 is a partial duplicate prediction for GLEAN3_17669.
###Gene_Info_Comments GLEAN3_04893 ###
GLEAN3_01686 is a partial duplicate prediction for GLEAN3_04893.
###Gene_Info_Comments GLEAN3_01686 ###
GLEAN3_01686 is a partial duplicate prediction for GLEAN3_04893.
###Gene_Info_Comments GLEAN3_14697 ###
GLEAN3_11718 is a partial duplicate prediction for GLEAN3_14697.
###Gene_Info_Comments GLEAN3_11718 ###
GLEAN3_11718 is a partial duplicate prediction for GLEAN3_14697.
###Gene_Info_Comments GLEAN3_20801 ###
May be missing ~100 AA.
###Gene_Info_Comments GLEAN3_12946 ###
GLEAN3_05913 is a partial duplicate prediction for GLEAN3_12946.
###Gene_Info_Comments GLEAN3_05913 ###
GLEAN3_05913 is a partial duplicate prediction for GLEAN3_12946.
###Gene_Info_Comments GLEAN3_21795 ###
Missing first ~80 AA.
###Gene_Info_Comments GLEAN3_13396 ###
GLEAN3_00788 has first part of the TIMM44 gene. GLEAN3_13396 has the rest.
###Gene_Info_Comments GLEAN3_00778 ###
GLEAN3_00788 has first part of the TIMM44 gene. GLEAN3_13396 has the rest.
###Gene_Info_Comments GLEAN3_08672 ###
GLEAN3_06174 is a partial duplicate prediction for GLEAN3_08672.
###Gene_Info_Comments GLEAN3_06174 ###
GLEAN3_06174 is a partial duplicate prediction for GLEAN3_08672.
###Gene_Info_Comments GLEAN3_10671 ###
Likely missing the last exon.
###Gene_Info_Comments GLEAN3_17059 ###
GLEAN3_19019 has the first part and GLEAN3_17059 the rest. There is a significant overlap between the two.
###Gene_Info_Comments GLEAN3_19091 ###
GLEAN3_19019 has the first part and GLEAN3_17059 the rest. There is a significant overlap between the two.
###Gene_Info_Comments GLEAN3_20010 ###
Only the first half of TOP3A is predicted by this GLEAN; the rest is missing from the predictions.
###Gene_Info_Comments GLEAN3_17723 ###
Missing the first half of the gene.
###Gene_Info_Comments GLEAN3_05321 ###
Likely has an extra exon predicted.

###Gene_Info_Comments GLEAN3_21928 ###
Missing first ~40 AA.
###Gene_Info_Comments GLEAN3_11091 ###
Model is missing ~50 AA.
###Gene_Info_Comments GLEAN3_27066 ###
Missing first half.
###Gene_Info_Comments GLEAN3_03970 ###
GLEAN3_09819 is a partial duplicate prediction for GLEAN3_03970.
###Gene_Info_Comments GLEAN3_09819 ###
GLEAN3_09819 is a partial duplicate prediction for GLEAN3_03970.
###Gene_Info_Comments GLEAN3_15304 ###
Incomplete model. First half is not correctly predicted.
###Gene_Info_Comments GLEAN3_13174 ###
GLEAN3_20504 is a partial duplicate prediction for GLEAN3_13174.
###Gene_Info_Comments GLEAN3_20504 ###
GLEAN3_20504 is a partial duplicate prediction for GLEAN3_13174.
###Gene_Info_Comments GLEAN3_15044 ###
GLEAN3_11230 is a partial duplicate prediction for GLEAN3_15044.
###Gene_Info_Comments GLEAN3_11230 ###
GLEAN3_11230 is a partial duplicate prediction for GLEAN3_15044.
###Gene_Info_Comments GLEAN3_11567 ###
GLEAN3_11567 has first part of the gene and GLEAN3_25450 has the rest.
###Gene_Info_Comments GLEAN3_25450 ###
GLEAN3_11567 has first part of the gene and GLEAN3_25450 has the rest.
###Gene_Info_Comments GLEAN3_27693 ###
GLEAN3_22721 is a partial duplicate prediction for GLEAN3_27693.
###Gene_Info_Comments GLEAN3_22721 ###
GLEAN3_22721 is a partial duplicate prediction for GLEAN3_27693.
###Gene_Info_Comments GLEAN3_08040 ###
GLEAN3_27793 is a partial duplicate prediction for GLEAN3_08040.
###Gene_Info_Comments GLEAN3_27793 ###
GLEAN3_27793 is a partial duplicate prediction for GLEAN3_08040.
###Gene_Info_Comments GLEAN3_15114 ###
GLEAN3_15114 is a partial duplicate prediction for GLEAN3_15351.
###Gene_Info_Comments GLEAN3_01978 ###
Incorrect gene model. Longer than required.
GLEAN3_03188 is a partial duplicate prediction for GLEAN3_01978.
###Gene_Info_Comments GLEAN3_03183 ###
GLEAN3_03188 is a partial duplicate prediction for GLEAN3_01978.
###Gene_Info_Comments GLEAN3_12816 ###
GLEAN3_12816 has first part of the gene and GLEAN3_25309 has the latter. 
###Gene_Info_Comments GLEAN3_25309 ###
GLEAN3_12816 has first part of the gene and GLEAN3_25309 has the latter. 
###Gene_Info_Comments GLEAN3_27769 ###
GLEAN3_27769 is a partial duplicate prediction for GLEAN3_02606.
###Gene_Info_Comments GLEAN3_28866 ###
GLEAN3_28866 is a partial duplicate prediction for GLEAN3_22607.
###Gene_Info_Comments GLEAN3_26082 ###
Transcriptome data and alignment with the best BLAST hit suggest that the prediction may lack an N-terminal exon.
###Gene_Info_Comments GLEAN3_19582 ###
GLEAN3_19260 is a partial duplicate prediction for GLEAN3_19582.
###Gene_Info_Comments GLEAN3_19260 ###
GLEAN3_19260 is a partial duplicate prediction for GLEAN3_19582.
###Gene_Info_Comments GLEAN3_21510 ###
GLEAN3_01446 has the first part of the HECTD1 gene. GLEAN3_21510 has the rest. There is an overlap between the two GLEANs.
###Gene_Info_Comments GLEAN3_05518 ###
Missing an exon at beginning.
###Gene_Info_Comments GLEAN3_19431 ###
Missing first half of the gene. GLEAN3_09004 is a partial duplicate prediction for GLEAN3_19431.
###Gene_Info_Comments GLEAN3_09004 ###
Missing first half of the gene. GLEAN3_09004 is a partial duplicate prediction for GLEAN3_19431.
###Gene_Info_Comments GLEAN3_03914 ###
GLEAN3_05017 is a partial duplicate prediction for GLEAN3_03914.
###Gene_Info_Comments GLEAN3_05017 ###
GLEAN3_05017 is a partial duplicate prediction for GLEAN3_03914.
###Gene_Info_Comments GLEAN3_07280 ###
GLEAN3_07280, GLEAN3_13615, GLEAN3_21285 and GLEAN3_27378 are all "phospholipid scramblase like" sequences.
###Gene_Info_Comments GLEAN3_13615 ###
GLEAN3_07280, GLEAN3_13615, GLEAN3_21285 and GLEAN3_27378 are all "phospholipid scramblase like" sequences.
###Gene_Info_Comments GLEAN3_21285 ###
GLEAN3_07280, GLEAN3_13615, GLEAN3_21285 and GLEAN3_27378 are all "phospholipid scramblase like" sequences.
###Gene_Info_Comments GLEAN3_05123 ###
GLEAN3_05123 has the first part of SCFD1 gene and GLEAN3_05492 has the other half.
###Gene_Info_Comments GLEAN3_05492 ###
GLEAN3_05123 has the first part of SCFD1 gene and GLEAN3_05492 has the other half.
###Gene_Info_Comments GLEAN3_17144 ###
GLEAN3_17144 appears to have the first half of the VPS33B gene, while GLEAN3_02218 may have the remaining part.
###Gene_Info_Comments GLEAN3_02218 ###
GLEAN3_17144 appears to have the first half of the VPS33B gene, while GLEAN3_02218 may have the remaining part.
###Gene_Info_Comments GLEAN3_24158 ###
May be missing the last exon.
###Gene_Info_Comments GLEAN3_05862 ###
GLEAN3_01745 is a duplicate prediction for GLEAN3_05862.
###Gene_Info_Comments GLEAN3_01745 ###
GLEAN3_01745 is a duplicate prediction for GLEAN3_05862.
###Gene_Info_Comments GLEAN3_02387 ###
Incorrect gene model. Likely extra exon(s) predicted for the longer model GLEAN3_02387. GLEAN3_16361 has better sequence homology with GBF1 but is incomplete.
###Gene_Info_Comments GLEAN3_16361 ###
Incorrect gene model. Likely extra exon(s) predicted for the longer model GLEAN3_02387. GLEAN3_16361 has better sequence homology with GBF1 but is incomplete.
###Gene_Info_Comments GLEAN3_03406 ###
GLEAN3_25852 is a partial duplicate prediction for GLEAN3_03406.
GLEAN3_03406 is likely missing the last exon(s).
###Gene_Info_Comments GLEAN3_25852 ###
GLEAN3_25852 is a partial duplicate prediction for GLEAN3_03406.
###Gene_Info_Comments GLEAN3_19999 ###
Missing first ~80 AA.
###Gene_Info_Comments GLEAN3_06680 ###
Incorrect gene model. Longer than is necessary.
###Gene_Info_Comments GLEAN3_08487 ###
GLEAN3_08487 is a duplicate prediction for GLEAN3_13194.
###Gene_Info_Comments GLEAN3_13647 ###
GLEAN3_13648 is a duplicate prediction for GLEAN3_13647.
###Gene_Info_Comments GLEAN3_13648 ###
GLEAN3_13648 is a duplicate prediction for GLEAN3_13647.
###Gene_Info_Comments GLEAN3_27976 ###
Likely missing first exon.
###Gene_Info_Comments GLEAN3_10670 ###
Incorrect gene model. Prediction longer than necessary.
###Gene_Info_Comments GLEAN3_03197 ###
Incorrect gene model. Likely a hybrid prediction.
###Gene_Info_Comments GLEAN3_13577 ###
Extra ~40AA (exon).
###Gene_Info_Comments GLEAN3_03805 ###
GLEAN3_25072 is a similar prediction.
###Gene_Info_Comments GLEAN3_25072 ###
GLEAN3_03805 is a similar prediction.

###Gene_Info_Comments GLEAN3_00723 ###
Missing first ~25 AA.
###Gene_Info_Comments GLEAN3_26892 ###
GLEAN3_26892 has most of the TUBGCP3 gene. Last 150 aa are likely encoded by GLEAN3_10158. GLEAN3_21617 is a partial duplicate prediction for GLEAN3_26892.
###Gene_Info_Comments GLEAN3_21617 ###
GLEAN3_26892 has most of the TUBGCP3 gene. Last 150 aa are likely encoded by GLEAN3_10158. GLEAN3_21617 is a partial duplicate prediction for GLEAN3_26892.
###Gene_Info_Comments GLEAN3_10158 ###
GLEAN3_26892 has most of the TUBGCP3 gene. Last 150 aa are likely encoded by GLEAN3_10158. GLEAN3_21617 is a partial duplicate prediction for GLEAN3_26892.
###Gene_Info_Comments GLEAN3_11574 ###
Exons 24, 14, 31 and 46 are not conserved or expressed according to the transcriptome data and are therefore questionable.
###Gene_Info_Comments GLEAN3_09648 ###
Missing first ~100 AA.
###Gene_Info_Comments GLEAN3_02510 ###
GLEAN3_28365 is a partial duplicate prediction for GLEAN3_02510.
###Gene_Info_Comments GLEAN3_28635 ###
GLEAN3_28365 is a partial duplicate prediction for GLEAN3_02510.
###Gene_Info_Comments GLEAN3_01119 ###
GLEAN3_01119 and GLEAN3_23988 are overlapping incomplete predictions for SSRP1. GLEAN3_01119 is missing ~100 AA at the beginning and end of SSRP1. GLEAN3_23988 is missing ~250 AA at the beginning but goes all the way to the end of SSRP1.
###Gene_Info_Comments GLEAN3_23988 ###
GLEAN3_01119 and GLEAN3_23988 are overlapping incomplete predictions for SSRP1. GLEAN3_01119 is missing ~100 AA at the beginning and end of SSRP1. GLEAN3_23988 is missing ~250 AA at the beginning but goes all the way to the end of SSRP1.
###Gene_Info_Comments GLEAN3_22853 ###
GLEAN3_22853 is a partial duplicate prediction for GLEAN3_19373.
###Gene_Info_Comments GLEAN3_11902 ###
GLEAN3_07972 is a partial duplicate prediction for GLEAN3_11902.
###Gene_Info_Comments GLEAN3_07972 ###
GLEAN3_07972 is a partial duplicate prediction for GLEAN3_11902.
###Gene_Info_Comments GLEAN3_03845 ###
GLEAN3_22665 encodes the first part of the RABEP1 gene and GLEAN3_03845 appears to encode the latter half. There is overlap between the two predictions.
###Gene_Info_Comments GLEAN3_22665 ###
GLEAN3_22665 encodes the first part of the RABEP1 gene and GLEAN3_03845 appears to encode the latter half. There is overlap between the two predictions.
###Gene_Info_Comments GLEAN3_11194 ###
PREDICTED: similar to fertility related protein WMP1 [Strongylocentrotus purpuratus], spermatogenesis-associated protein 7 [Rattus norvegicus]
###Gene_Info_Comments GLEAN3_11195 ###
PREDICTED: hypothetical protein [Mus musculus]

###Gene_Info_Comments GLEAN3_11196 ###
Tryptophanyl-tRNA synthetase, cytoplasmic (Tryptophan--tRNA ligase) (TrpRS)
###Gene_Info_Comments GLEAN3_11198 ###
PREDICTED: similar to polymerase (DNA directed), alpha 2 (70kD subunit) [Strongylocentrotus purpuratus]

###Gene_Info_Comments GLEAN3_11201 ###
PREDICTED: hypothetical protein [Strongylocentrotus purpuratus], PREDICTED: similar to golgi-specific brefeldin A-resistance guanine nucleotide exchange factor 1 [Strongylocentrotus purpuratus]

###Gene_Info_Comments GLEAN3_03899 ###
hypothetical protein LOC423694 [Gallus gallus]

###Gene_Info_Comments GLEAN3_03902 ###
myosin V [Strongylocentrotus purpuratus]
###Gene_Info_Comments GLEAN3_03903 ###
PREDICTED: hypothetical protein
###Gene_Info_Comments GLEAN3_23288 ###
PREDICTED: hypothetical protein 
###Gene_Info_Comments GLEAN3_23289 ###
PREDICTED: similar to Ribonuclease Oy (RNase Oy)
###Gene_Info_Comments GLEAN3_08157 ###
hypothetical protein DDBDRAFT_0206412 [Dictyostelium discoideum AX4]
###Gene_Info_Comments GLEAN3_08158 ###
PREDICTED: similar to peroxisomal biogenesis factor 6-like protein [Strongylocentrotus purpuratus]
###Gene_Info_Comments GLEAN3_14604 ###
PREDICTED: similar to smad nuclear interacting protein [Strongylocentrotus purpuratus]

###Gene_Info_Comments GLEAN3_15924 ###
PREDICTED: similar to acid-sensing ion channel 1, partial [Strongylocentrotus purpuratus]

###Gene_Info_Comments GLEAN3_25080 ###
PREDICTED: similar to Leucine rich repeat containing 15 [Strongylocentrotus purpuratus]

###Gene_Info_Comments GLEAN3_25081 ###
PREDICTED: similar to Transmembrane and coiled-coil domains 2 [Strongylocentrotus purpuratus]

###Gene_Info_Comments GLEAN3_23431 ###
PREDICTED: similar to selectin-like protein
###Gene_Info_Comments GLEAN3_07137 ###
PREDICTED: similar to LDL receptor adaptor protein (ARH) [Bos taurus]

###Gene_Info_Comments GLEAN3_07138 ###
PREDICTED: hypothetical protein [Strongylocentrotus purpuratus]

###Gene_Info_Comments GLEAN3_07139 ###
PREDICTED: hypothetical protein
PREDICTED: similar to heparan sulfate sulfotransferase [Strongylocentrotus purpuratus]

###Gene_Info_Comments GLEAN3_07140 ###
PREDICTED: similar to WD repeat domain 66 [Strongylocentrotus purpuratus]
###Gene_Info_Comments GLEAN3_08948 ###
PREDICTED: similar to xanthine dehydrogenase [Strongylocentrotus purpuratus]
###Gene_Info_Comments GLEAN3_08949 ###
PREDICTED: similar to codanin 1 [Gallus gallus]

###Gene_Info_Comments GLEAN3_08950 ###
PREDICTED: similar to opsin [Strongylocentrotus purpuratus]
###Gene_Info_Comments GLEAN3_08951 ###
PREDICTED: KIAA0586 isoform 7 [Pan troglodytes]
###Gene_Info_Comments GLEAN3_08952 ###
hypothetical protein LOC324723 [Danio rerio]
###Gene_Info_Comments GLEAN3_08955 ###
PREDICTED: similar to small zinc finger-like protein [Strongylocentrotus purpuratus], also similar to G protein-coupled receptor 1 [Strongylocentrotus purpuratus]


###Gene_Info_Comments GLEAN3_08956 ###
PREDICTED: similar to neurexin iv [Strongylocentrotus purpuratus]
###Gene_Info_Comments GLEAN3_08957 ###
PREDICTED: similar to ligand Delta-1, partial [Strongylocentrotus purpuratus]

###Gene_Info_Comments GLEAN3_08958 ###
PREDICTED: hypothetical protein [Strongylocentrotus purpuratus]

###Gene_Info_Comments GLEAN3_08960 ###
PREDICTED: similar to CG4089-PA [Tribolium castaneum]
###Gene_Info_Comments GLEAN3_27146 ###
PREDICTED: hypothetical protein isoform 1 [Strongylocentrotus purpuratus]
###Gene_Info_Comments GLEAN3_23395 ###
PREDICTED: similar to putative urease accessory protein F [Strongylocentrotus purpuratus]

###Gene_Info_Comments GLEAN3_23396 ###
similar to fibrosurfin, partial [Strongylocentrotus purpuratus]

###Gene_Info_Comments GLEAN3_23399 ###
PREDICTED: similar to hyalin [Strongylocentrotus purpuratus]
###Gene_Info_Comments GLEAN3_00994 ###
PREDICTED: similar to alpha macroglobulin [Strongylocentrotus purpuratus].
###Gene_Info_Comments GLEAN3_00996 ###
PREDICTED: hypothetical protein [Strongylocentrotus purpuratus], similar to tumor protein p53 inducible protein 11, isoform CRA_b [Homo sapiens]

###Gene_Info_Comments GLEAN3_02116 ###
PREDICTED: hypothetical protein [Strongylocentrotus purpuratus]; also similar to Rap2-binding protein 9 isoform 4 [Macaca mulatta].

###Gene_Info_Comments Sp-185/333-07 ###
This gene goes out of frame in element 2, and therefore may be either a pseudogene or a problem in the assembly.
###Gene_Info_Comments GLEAN3_00031 ###
SP, lec, PAN? kringle, lec
###Gene_Info_Comments GLEAN3_00141 ###
lec, EGF-like (epidermal-growth-factor like domain), laminin-EGF, fibronectin 3, TM
###Gene_Info_Comments GLEAN3_00192 ###
lec
###Gene_Info_Comments GLEAN3_00271 ###
EGF(epidermal growth factor like domain), lec
###Gene_Info_Comments GLEAN3_00289 ###
Clec X 2
###Gene_Info_Comments GLEAN3_00290 ###
Clec X 3 (or 2.5)
###Gene_Info_Comments GLEAN3_00542 ###
lec, PAN
###Gene_Info_Comments GLEAN3_00543 ###
lec, EGF (epidermal growth factor like domain), EGF(epidermal growth factor like domain), TM, cyt; macrophage mannose binding lec?
###Gene_Info_Comments GLEAN3_00837 ###
SP, lec
###Gene_Info_Comments GLEAN3_01027 ###
low complexity, Clec X 2, low complexity, EGF (epidermal growth factor like domain)-like X 3, TM, low complexity
###Gene_Info_Comments GLEAN3_01200 ###
EGF, EGF, EGF, EGF, EGF, EGF, EGF, EGF, lec, EGF
*EGF = Epidermal Growth Factor like domain
###Gene_Info_Comments GLEAN3_01527 ###
SP, lec, hyalin, lec, hyalin, hyalin, apple domain (plasminogen)
###Gene_Info_Comments GLEAN3_01878 ###
gal_lec, transmembrane
###Gene_Info_Comments GLEAN3_01887 ###
EGF (Epidermal growth factor like domain), lec
###Gene_Info_Comments GLEAN3_01898 ###
lec, apple domain
###Gene_Info_Comments GLEAN3_02005 ###
Gal_lec
###Gene_Info_Comments GLEAN3_02383 ###
SP, lec
###Gene_Info_Comments GLEAN3_02420 ###
Clec X 2, low complexity, LY(Low-density lipoprotein-receptor YWTD domain) X 2
###Gene_Info_Comments GLEAN3_02718 ###
lec, CCP(Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)), TM
###Gene_Info_Comments GLEAN3_02720 ###
SP, gal_lec, coiled coil, TM, TM, TM, TM, TM, TM, TM, low complexity region
###Gene_Info_Comments GLEAN3_02861 ###
EGF(epidermal growth factor like domain), hyalin, hyalin, lec
###Gene_Info_Comments GLEAN3_03246 ###
SP, gal_lec, transmembrane, low complexity
###Gene_Info_Comments GLEAN3_03284 ###
Kringle, lec, EGF(epidermal growth factor like domain)-CA(cadherin repeats)
###Gene_Info_Comments GLEAN3_03610 ###
Sp, Clec X 3, TM
###Gene_Info_Comments GLEAN3_04014 ###
lec
###Gene_Info_Comments GLEAN3_04476 ###
hyalin, hyalin, lec
###Gene_Info_Comments GLEAN3_04818 ###
Clec X 2
###Gene_Info_Comments GLEAN3_04831 ###
low complexity, gal_lec
###Gene_Info_Comments GLEAN3_05013 ###
SP, CCP, lec, CCP, CCP, low complexity, TM

*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
###Gene_Info_Comments GLEAN3_05028 ###
lec, LDLA, LDLA

*LDLA = Low-density lipoprotein receptor domain class A
###Gene_Info_Comments GLEAN3_05248 ###
lec, CCP (Low-density lipoprotein receptor domain class A), TM, cyt
###Gene_Info_Comments GLEAN3_05596 ###
EGF (epidermal growth factor like domain), lec, lec
###Gene_Info_Comments GLEAN3_05706 ###
SP, EGF (Epidermal growth factor like domain), hyalin, lec
###Gene_Info_Comments GLEAN3_05725 ###
CUB, coag factor 5/8 C terminal domain, lec, LDLa, LDLa, LDLa, space,  LDLa
*CUB = Domain first found in C1r, C1s, uEGF, and bone morphogenetic protein.
*LDLA = Low-density lipoprotein receptor domain class A
###Gene_Info_Comments GLEAN3_05888 ###
Clec X 2
###Gene_Info_Comments GLEAN3_05909 ###
CCP, CCP, lec, CCP, CCP, EGF

* CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
###Gene_Info_Comments GLEAN3_05910 ###
lec C-type, CCP, CCP, TM, cyt

*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
###Gene_Info_Comments GLEAN3_06083 ###
lec, hyalin, hyalin, hyalin, hyalin, apple domain
###Gene_Info_Comments GLEAN3_06306 ###
gal_lec, gal_lec
###Gene_Info_Comments GLEAN3_06310 ###
EGF(epidermal growth factor like domain), lec
###Gene_Info_Comments GLEAN3_06620 ###
apple, lec
###Gene_Info_Comments GLEAN3_06648 ###
Clec, CCP, CCP, low complexity, TM

*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
###Gene_Info_Comments GLEAN3_06649 ###
SP,  CCP, CCP, lec

*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
###Gene_Info_Comments GLEAN3_07118 ###
CCP, lec, CCP, TM

* CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
###Gene_Info_Comments GLEAN3_07565 ###
gal_lec
###Gene_Info_Comments GLEAN3_07566 ###
low complex, lec
###Gene_Info_Comments GLEAN3_08001 ###
lec
###Gene_Info_Comments GLEAN3_08065 ###
SP,  lec, EGF,EGF, protein tyr phosphatase or fibronectin3

*EGF = Epidermal Growth Factor like Domain
###Gene_Info_Comments GLEAN3_08976 ###
CCP, lec

*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
###Gene_Info_Comments GLEAN3_09174 ###
SP, EGF (epidermal growth factor like domain), hyalin, hyalin,  lec
###Gene_Info_Comments GLEAN3_09222 ###
CCP, lec, lec, CCP, CCP, EGF, EGF

*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)

*EGF = epidermal growth factor like domain
###Gene_Info_Comments GLEAN3_09615 ###
lec
###Gene_Info_Comments GLEAN3_09886 ###
CLECT X 4
###Gene_Info_Comments GLEAN3_10019 ###
SP, gal_lec, transmembrane
###Gene_Info_Comments GLEAN3_10099 ###
Sp, lec
###Gene_Info_Comments GLEAN3_10313 ###
gal_lec, gal_lec, gal_lec
###Gene_Info_Comments GLEAN3_10317 ###
Clec X 2, low complexity
###Gene_Info_Comments GLEAN3_10470 ###
Clec X 2, PAN_1
###Gene_Info_Comments GLEAN3_10615 ###
SP, lec, PAN?
###Gene_Info_Comments GLEAN3_10639 ###
PAN apple, lec, PAN apple, TM
###Gene_Info_Comments GLEAN3_11014 ###
F5F8 type C, C_lec, gal_lec, C_lec, FA58C(Coagulation factor 5/8 C-terminal domain, discoidin domain), Protein has FTP (eel-Fucolectin Tachylectin-4 Pentaxrin-1 Domain) at relatively low probability

###Gene_Info_Comments GLEAN3_11167 ###
lec, CCP, CCP, EGF, EGF, TM

*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
*EGF = epidermal growth factor like domain
###Gene_Info_Comments GLEAN3_11176 ###
SCOP or Clectin, low complexity, GPS (G-protein-coupled receptor proteolytic site domain), 7TM-2
###Gene_Info_Comments GLEAN3_11192 ###
EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, lec

*EGF(epidermal growth factor like domain)-Ca(Cadherin repeats)
###Gene_Info_Comments GLEAN3_11632 ###
lec
###Gene_Info_Comments GLEAN3_11829 ###
Sp, Clec X 3, PAN 1
###Gene_Info_Comments GLEAN3_11867 ###
EGF (epidermal growth factor like domain) X 2, FTP(eel-Fucolectin Tachylectin-4 Pentaxrin-1 Domain)
###Gene_Info_Comments GLEAN3_11971 ###
SP, CUB, CUB,CUB, lec, LDLa, LDLa, LDLa, LDLa, LDLa, LDLa

*CUB = Domain first found in C1r, C1s, uEGF, and bone morphogenetic protein.

*LDLA = Low-density lipoprotein receptor domain class A
###Gene_Info_Comments GLEAN3_12121 ###
CCP, CCP, lec, CCP, CCP, EGF-Ca, EGF-Ca, EGF-Ca, TM

*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)

*EGF-Ca = Epidermal Growth factor like domain - Cadherin repeats.

###Gene_Info_Comments GLEAN3_12302 ###
low complexity, gal_lec
###Gene_Info_Comments GLEAN3_12477 ###
CLec X 5, PAN_AP (divergent subfamily of APPLE domains), CLec X 3
###Gene_Info_Comments GLEAN3_12478 ###
Clec, PAN_AP(divergent subfamily of APPLE domains), Clec, Clec, Clec
###Gene_Info_Comments GLEAN3_12479 ###
Sp, Clec X 4, TM
###Gene_Info_Comments GLEAN3_12585 ###
lec, CCP, CCP, TM

*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
###Gene_Info_Comments GLEAN3_12742 ###
Clec X 2, CCP, EGF(epidermal growth factor like domain)-like X 2, TM
###Gene_Info_Comments GLEAN3_12837 ###
TM, lec
###Gene_Info_Comments GLEAN3_12869 ###
CUB, CUB, CUB, CUB, CUB or lec, LDLa, LDLa, LDLa, LDLa, TM, TM, TM, TM, TM, TM, TM

*CUB = Domain first found in C1r, C1s, uEGF, and bone morphogenetic protein.

*LDLA = Low-density lipoprotein receptor domain class A
###Gene_Info_Comments GLEAN3_12951 ###
PAN, FA5/8C, FA5/8C, FA5/8C, lec, CUB, CUB, CUB, lec

*FA58C = Coagulation factor 5/8 C-terminal domain, discoidin domain
*CUB = Domain first found in C1r, C1s, uEGF, and bone morphogenetic protein.
###Gene_Info_Comments GLEAN3_13186 ###
SP,  CCP, CCP, leclec

*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
###Gene_Info_Comments GLEAN3_13388 ###
lec
###Gene_Info_Comments GLEAN3_13825 ###
lec, low complex
###Gene_Info_Comments GLEAN3_13860 ###
Sp, lec, lec
###Gene_Info_Comments GLEAN3_13872 ###
Sp, lec,lec

###Gene_Info_Comments GLEAN3_14004 ###
SP, low complexity, gal_lec
###Gene_Info_Comments GLEAN3_14097 ###
SRCR , lec, EGF (Epidermal Growth Factor Like Domain)

###Gene_Info_Comments GLEAN3_14218 ###
gal_lec, EGF_like, EGF_like, transmembrane

*EGF = epidermal growth factor like domain
###Gene_Info_Comments GLEAN3_14219 ###
lec, low complex, TM, long cyt
###Gene_Info_Comments GLEAN3_14220 ###
Low complexity, TM
Alternative: Clec match below threshold

###Gene_Info_Comments GLEAN3_14585 ###
(FTP(eel-Fucolectin Tachylectin-4 Pentaxrin-1 Domain) domain) F5F8_type_C(Coagulation factor 5/8 C-terminal domain, discoidin domain), gal_lec, gal_lec, gal_lec
###Gene_Info_Comments GLEAN3_14623 ###
lec, PAN
###Gene_Info_Comments GLEAN3_14672 ###
CUB, lec, low complex
###Gene_Info_Comments GLEAN3_15065 ###
Clec X 2, PAN_1
###Gene_Info_Comments GLEAN3_15487 ###
Sp, Lec
###Gene_Info_Comments GLEAN3_15726 ###
lec
###Gene_Info_Comments GLEAN3_15961 ###
lec, pan
###Gene_Info_Comments GLEAN3_15962 ###
Clec X 2, PAN 1
###Gene_Info_Comments GLEAN3_16088 ###
DUF1339,lec,TM
###Gene_Info_Comments GLEAN3_16103 ###
SP,  CCP, CCP, lec

*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
###Gene_Info_Comments GLEAN3_16529 ###
CCP, CCP, lec, low complexity region
*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
###Gene_Info_Comments GLEAN3_16629 ###
Clec, CCP X 3, EGF (Epidermal growth factor like domain)-like, CCP, TM

*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
###Gene_Info_Comments GLEAN3_16681 ###
CCP, lec, CCP, CCP, EGF(Epidermal Growth Factor like domain)/Ca X34, TM, short cyt

*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)


###Gene_Info_Comments GLEAN3_16771 ###
lec


###Gene_Info_Comments GLEAN3_16772 ###
lec
###Gene_Info_Comments GLEAN3_17006 ###
CCP, CCP, CCP, lec, CCP, CCP, CCP, TM, cyt

*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)


###Gene_Info_Comments GLEAN3_17007 ###
SP, CCP, CCP, CCP, lec, CCP, TM, cyt

*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
###Gene_Info_Comments GLEAN3_17109 ###
Gal_lec
###Gene_Info_Comments GLEAN3_17519 ###
CCP(Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)), lec
###Gene_Info_Comments GLEAN3_17520 ###
lec, CCP, CCP

*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
###Gene_Info_Comments GLEAN3_17842 ###
lec overlaps with VWF type C
###Gene_Info_Comments GLEAN3_17887 ###
Sp, Clec, PAN_AP(divergent subfamily of APPLE domains), Clec X3, SCOP
###Gene_Info_Comments GLEAN3_18258 ###
lec, CCP, CCP, EGF(epidermal growth factor like domain)

*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
###Gene_Info_Comments GLEAN3_18501 ###
low complexity, Clec X 2
###Gene_Info_Comments GLEAN3_18550 ###
SP, gal_lec, gal_lec, low complexity
###Gene_Info_Comments GLEAN3_18601 ###
lec, EGF(epidermal growth factor like domain)
###Gene_Info_Comments GLEAN3_18682 ###
low complexity, gal_lec, gal_lec
###Gene_Info_Comments GLEAN3_18683 ###
gal_lec, gal_lec
###Gene_Info_Comments GLEAN3_18940 ###
Clec X4, TM
###Gene_Info_Comments GLEAN3_19060 ###
lec, low complexity
###Gene_Info_Comments GLEAN3_19088 ###
SP, lec
###Gene_Info_Comments GLEAN3_19150 ###
lec, LDLa, LDLa

*LDLA = Low-density lipoprotein receptor domain class A
###Gene_Info_Comments GLEAN3_19438 ###
gal_lec, gal_lec, PAN
###Gene_Info_Comments GLEAN3_19576 ###
SP, lec, EGF, EGF

*EGF = epidermal growth factor like domain
###Gene_Info_Comments GLEAN3_19805 ###
Clec, Clec, low complexity
###Gene_Info_Comments GLEAN3_19986 ###
SP,  lec, EGF, EGF

*EGF = Epidermal growth factor like domain
###Gene_Info_Comments GLEAN3_20377 ###
lec, PAN
###Gene_Info_Comments GLEAN3_20424 ###
low complexity, lec, lec
###Gene_Info_Comments GLEAN3_20513 ###
CCP(Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)), lec
###Gene_Info_Comments GLEAN3_20760 ###
hyalin, hyalin, lec
###Gene_Info_Comments GLEAN3_21221 ###
low complexity, Clec X 2, low complexity
###Gene_Info_Comments GLEAN3_21505 ###
Clec X 2, low complexity regions
###Gene_Info_Comments GLEAN3_21675 ###
low complexity, gal_lec
###Gene_Info_Comments GLEAN3_21709 ###
SP, lec, space, PAN
###Gene_Info_Comments GLEAN3_22197 ###
SP, CUB, CUB, CUB, lec, TM

*CUB = Domain first found in C1r, C1s, uEGF, and bone morphogenetic protein.
###Gene_Info_Comments GLEAN3_22470 ###
SP, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, Clec, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca, EGF-Ca

* EGF = epidermal growth factor like domain
*CA = Cadherin repeats.
###Gene_Info_Comments GLEAN3_22595 ###
low complexity, gal_lec
###Gene_Info_Comments GLEAN3_22718 ###
SP,F_raikovi_mat, gal_lec, EGF_like, transmembrane
###Gene_Info_Comments GLEAN3_22719 ###
SP, gal_lec
###Gene_Info_Comments GLEAN3_23114 ###
Clec X 2, several low complexity regions
###Gene_Info_Comments GLEAN3_23130 ###
lec, long low complexity
###Gene_Info_Comments GLEAN3_23175 ###
lec overlap with LDLa, lec overlap with 2 LDLa, LDLa

*LDLA = Low-density lipoprotein receptor domain class A
###Gene_Info_Comments GLEAN3_23607 ###
EGF-Ca, lec, EGF-Ca, EGF-Ca, lec, EGF-Ca, lec, EGF-Ca, lec, EGF-Ca, lec, EGF-Ca, lec

*EGF = epidermal growth factor like domain 
*CA = Cadherin repeats.
###Gene_Info_Comments GLEAN3_23714 ###
Sp, Clec X 2
###Gene_Info_Comments GLEAN3_24218 ###
CCP X 2, CLec X 2, CCP

* CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
###Gene_Info_Comments GLEAN3_24494 ###
lec, low complexity
###Gene_Info_Comments GLEAN3_24638 ###
Clec X 3
###Gene_Info_Comments GLEAN3_24786 ###
FA5/8C, FA5/8C, lec, EGF-Ca, lec C-type, CUB, CUB, CUB, TM

*FA5/8C = Coagulation factor 5/8 C-terminal domain, discoidin domain

*EGF-Ca = Epidermal growth factor like domain - Cadherin repeats.

*CUB = Domain first found in C1r, C1s, uEGF, and bone morphogenetic protein.
###Gene_Info_Comments GLEAN3_25051 ###
Clec X 3, TM, low complexity
###Gene_Info_Comments GLEAN3_25074 ###
lec, lec
###Gene_Info_Comments GLEAN3_25097 ###
F5/F8 type C (Coagulation factor 5/8 C-terminal domain, discoidin domain), lec, lec C-type, CUB (Domain first found in C1r, C1s, uEGF, and bone morphogenetic protein.)
###Gene_Info_Comments GLEAN3_25194 ###
SP, CCP, CCP, CCP, TM, CCP, CCP, CCP, lec, EGF overlaps with histone deacetylase interacting domain

*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
*EGF = Epidermal Growth Factor like domain
###Gene_Info_Comments GLEAN3_25414 ###
SP, gal_lec, gal_lec, gal_lec, gal_lec
###Gene_Info_Comments GLEAN3_26103 ###
CCP, CCP, lec

*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
###Gene_Info_Comments GLEAN3_26208 ###
lec, CCP, CCP
*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
###Gene_Info_Comments GLEAN3_27084 ###
CUB (Domain first found in C1r, C1s, uEGF, and bone morphogenetic protein.) overlaps with lec, LDLa, LDLa, LDLa, LDLa, LDLa

*LDLA = Low-density lipoprotein receptor domain class A
###Gene_Info_Comments GLEAN3_27332 ###
SP, lec, long space, hormone receptor domain with many Cys, TM, TM, TM, TM, TM, TM, TM, long cyt
###Gene_Info_Comments GLEAN3_28067 ###
SP,  CCP, lec, TM, cyt
###Gene_Info_Comments GLEAN3_28298 ###
SP,  CCP, CCP, lec

*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
###Gene_Info_Comments GLEAN3_28326 ###
CUB (Domain first found in C1r, C1s, uEGF, and bone morphogenetic protein.), lec, LDLa (Low-density lipoprotein receptor domain class A)

###Gene_Info_Comments GLEAN3_28539 ###
Sp, Clec, Kring X 3, Clec X 6, HDAC_interact(Histone deacetylase (HDAC) interacting), SCOP
###Gene_Info_Comments GLEAN3_28565 ###
Sp, lec
###Gene_Info_Comments GLEAN3_28712 ###
CLec, Fibrinogen, PAN_AP(divergent subfamily of APPLE domains), CLec
###Gene_Info_Comments GLEAN3_28846 ###
lec, CCP (Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)), TM, cyt
###Gene_Info_Comments GLEAN3_14450 ###
CCP, lec, CCP

*CCP = Domain abundant in complement control proteins; SUSHI repeat; short complement-like repeat (SCR)
###Gene_Info_Comments GLEAN3_19008 ###
dystrophin-like protein [Strongylocentrotus purpuratus], 3908 aa

###Gene_Info_Comments GLEAN3_19010 ###
hypothetical protein isoform 1 [Strongylocentrotus purpuratus]
###Gene_Info_Comments GLEAN3_10580 ###
similar to ubiquitin-conjugating enzyme E2D 3 [Strongylocentrotus purpuratus]
###Gene_Info_Comments GLEAN3_10581 ###
similar to cathepsin l [Strongylocentrotus purpuratus], 821 aa
###Gene_Info_Comments GLEAN3_10582 ###
similar to transcriptional intermediary factor 1 alpha [Strongylocentrotus purpuratus], 220 aa
###Gene_Info_Comments GLEAN3_09812 ###
2OG-Fe(II) oxygenase domain
###Gene_Info_Comments GLEAN3_16862 ###
frizzled domain, CCP X2, CLECT, CCP
###Gene_Info_Comments GLEAN3_23863 ###
GLEAN3_23863 is a partial duplicate prediction for GLEAN3_28180
###Gene_Info_Comments GLEAN3_25316 ###
Domains: low complexity, Clec X 2, low complexity, Clec X 2, low complexity
###Gene_Info_Comments GLEAN3_14701 ###
Dynein heavy chain, N-terminal region 1. Dynein heavy chains interact with other heavy chains to form dimers, and with intermediate chain-light chain complexes to form a basal cargo binding unit. The region featured in this family includes the sequences implicated in mediating these interactions. It is thought to be flexible and not to adopt a rigid conformation.
###Gene_Info_Comments GLEAN3_00239 ###
Many eukaryotic proteins that are known or supposed to bind single-stranded RNA contain one or more copies of a putative RNA-binding domain of about 90 amino acids. This is known as the eukaryotic putative RNA-binding region RNP-1 signature or RNA recognition motif (RRM). RRMs are found in a variety of RNA binding proteins, including heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs). The motif also appears in a few single stranded DNA binding proteins. The RRM structure consists of four strands and two helices arranged in an alpha/beta sandwich, with a third helix present during RNA binding in some cases. Two individual models were built which identify subtypes of this domain, but there is no functional difference between the subtypes. 
###Gene_Info_Comments GLEAN3_27751 ###
Among the different families of transporters, only two occur ubiquitously in all classifications of organisms. These are the ATP-Binding Cassette (ABC) superfamily and the Major Facilitator Superfamily (MFS). The MFS transporters are single-polypeptide secondary carriers capable only of transporting small solutes in response to chemiosmotic ion gradients.
###Gene_Info_Comments GLEAN3_23347 ###
Ras proteins are membrane-associated molecular switches that bind GTP and GDP and slowly hydrolyze GTP to GDP. The balance between the GTP bound (active) and GDP bound (inactive) states is regulated by the opposite action of proteins activating the GTPase activity and that of proteins which promote the loss of bound GDP and the uptake of fresh GTP. The latter proteins are known as guanine-nucleotide dissociation stimulators (GDSs) (or also as guanine-nucleotide releasing (or exchange) factors (GRFs)). Proteins that act as GDS can be classified into at least two families, on the basis of sequence similarities, the CDC24 family (see IPR001331) and the CDC25 family.
The size of the proteins of the CDC25 family ranges from 309 residues (LTE1) to 1596 residues (sos). The sequence similarity shared by all these proteins is limited to a region of about 250 amino acids generally located in their C-terminal section (currently the only exceptions are sos and ralGDS where this domain makes up the central part of the protein). This domain has been shown, in CDC25 and SCD25, to be essential for the activity of these proteins.

###Gene_Info_Comments GLEAN3_18998 ###
Hormone receptor (7-transmembrane receptor)
###Gene_Info_Comments GLEAN3_18997 ###
Hormone receptor (7-transmembrane protein)
###Gene_Info_Comments GLEAN3_10716 ###
ATP-binding cassette (ABC) transporters are multidomain membrane proteins, responsible for the controlled efflux and influx of substances (allocrites) across cellular membranes. They are minimally composed of four domains, with two transmembrane domains (TMDs) responsible for allocrite binding and transport and two nucleotide-binding domains (NBDs) responsible for coupling the energy of ATP hydrolysis to conformational changes in the TMDs. Both NBDs are capable of ATP hydrolysis, and inhibition of hydrolysis at one NBD effectively abrogates hydrolysis at the other. Hydrolysis at the two NBDs may occur in an alternating fashion, although they appear substantially functionally symmetrical in terms of their binding to diverse nucleotides.
###Gene_Info_Comments GLEAN3_10968 ###
Epidermal growth factors and transforming growth factors belong to a general class of proteins that share a repeat pattern involving a number of conserved Cys residues. Growth factors are involved in cell recognition and division. The repeating pattern, especially of cysteines (the so-called EGF repeat), is thought to be important to the 3D structure of the proteins, and hence its recognition by receptors and other molecules. The type 1 EGF signature includes six conserved cysteines believed to be involved in disulphide bond formation. The EGF motif is found frequently in nature, particularly in extracellular proteins.
###Gene_Info_Comments GLEAN3_05786 ###
Zinc finger domains are nucleic acid-binding protein structures first identified in the Xenopus laevis transcription factor TFIIIA. These domains have since been found in numerous nucleic acid-binding proteins. A zinc finger domain is composed of 25 to 30 amino-acid residues including 2 conserved Cys and 2 conserved His residues in a C-2-C-12-H-3-H type motif. The 12 residues separating the second Cys and the first His are mainly polar and basic, implicating this region in particular in nucleic acid binding. The zinc finger motif is an unusually small, self-folding domain in which Zn is a crucial component of its tertiary structure. All bind 1 atom of Zn in a tetrahedral array to yield a finger-like projection, which interacts with nucleotides in the major groove of the nucleic acid. The Zn binds to the conserved Cys and His residues. Fingers have been found to bind to about 5 base pairs of nucleic acid containing short runs of guanine residues. They have the ability to bind to both RNA and DNA, a versatility not demonstrated by the helix-turn-helix motif. The zinc finger may thus represent the original nucleic acid binding protein. It has also been suggested that a Zn-centred domain could be used in a protein interaction, e.g. in protein kinase C. Many classes of zinc fingers are characterized according to the number and positions of the histidine and cysteine residues involved in the zinc atom coordination. In the first class to be characterized, called C2H2, the first pair of zinc coordinating residues are cysteines, while the second pair are histidines.
This THAP domain is a putative DNA-binding domain with a C2CH architecture that probably binds a zinc ion. The domain is widespread in Drosophila species, Mus musculus, Homo sapiens and has been reported in Caenorhabditis elegans.

###Gene_Info_Comments GLEAN3_10967 ###
The CUB domain is an extracellular domain of approximately 110 residues which is found in functionally diverse, mostly developmentally regulated proteins and in peptidases belonging to MEROPS peptidase families M12A (astacin) and S1A (chymotrypsin). Almost all CUB domains contain four conserved cysteines which probably form two disulphide bridges (C1-C2, C3-C4). The structure of the CUB domain has been predicted to be a beta-barrel similar to that of immunoglobulins. Proteins that have been found to contain the CUB domain include mammalian complement subcomponents C1s/C1r, which form the calcium-dependent complex C1, the first component of the classical pathway of the complement system; hamster serine protease Casp, which degrades type I and IV collagen and fibronectin in the presence of calcium; mammalian complement-activating component of Ra-reactive factor (RARF), a protease that cleaves the C4 component of complement; vertebrate enteropeptidase (EC 3.4.21.9), a type II membrane protein of the intestinal brush border, which activates trypsinogen; vertebrate bone morphogenetic protein 1 (BMP-1), a protein which induces cartilage and bone formation and expresses metalloendopeptidase activity; sea urchin blastula proteins BP10 and SpAN; Caenorhabditis elegans hypothetical proteins F42A10.8 and R151.5; neuropilin (A5 antigen), a calcium-independent cell adhesion molecule that functions during the formation of certain neuronal circuits; fibropellins I and III from sea urchin; mammalian hyaluronate-binding protein TSG-6 (or PS4), a serum and growth factor induced protein; mammalian spermadhesins; and Xenopus embryonic protein UVS.2, which is expressed during dorsoanterior development.
###Gene_Info_Comments GLEAN3_14285 ###
This Glean was picked up from one-dimensional gel electrophoresis and mass spectrometry.  It appears to be missing ~300 nt at the 5' end and ~200 nt at the 3' end compared to the best human hit.
###Gene_Info_Comments GLEAN3_28871 ###
This gene is most likely an allele of Glean3_07497
###Gene_Info_Comments GLEAN3_07497 ###
This gene has an allele in Glean3_28871
###Gene_Info_Comments GLEAN3_19286 ###
Two non-overlapping peptides were pulled from an IP coupled MS/MS from unfertilized eggs.
###Gene_Info_Comments GLEAN3_14951 ###
An SH2 domain pulled up from the Anger group mRNA expression database associated with SFK1; unclear if it is really expressed.  First ~400 aa of sequence has homology to "solute carrier family 15", followed by a full SH2 domain and then a partial TyK domain.  Probably not an SFK.
###Gene_Info_Comments GLEAN3_10845 ###
Looks to be an allele of Glean3_22112
###Gene_Info_Comments GLEAN3_23976 ###
Looks to be an allele of Glean3_04037.
###Gene_Info_Comments GLEAN3_23905 ###
C term of the prediction seems to be a piece of the Fringe protein
###Gene_Info_Comments GLEAN3_03256 ###
The GLEAN3 model shows little confidence and may thus be incorrect.  However, the sequence of the GLEAN model CDS is nearly identical to the probe used by Rast et al. 2002 Dev. Biol. and is thus likely the "Kakapo" discussed in that paper.
###Gene_Info_Comments GLEAN3_14715 ###
May be a fragment of GLEAN3_00296 or GLEAN3_03985 (or another gene altogether), as this model is significantly shorter than either of these gelsolin-like genes.  However, this Glean model is nearly identical to the probe for Gelsolin used in Rast et al. (2002) Dev. Biol., which suggests that this gene is expressed in the sea urchin in an area around the blastopore at > 24 hours, along with Kakapo and apobec.
