Identification
HMDB Protein ID CDBP04226
Secondary Accession Numbers Not Available
Name Histone-lysine N-methyltransferase SETD1B
Description Not Available
Synonyms
  1. Lysine N-methyltransferase 2G
  2. SET domain-containing protein 1B
  3. hSET1B
Gene Name SETD1B
Protein Type Enzyme
Biological Properties
General Function Involved in nucleotide binding
Specific Function Histone methyltransferase that specifically methylates 'Lys-4' of histone H3, when part of the SET1 histone methyltransferase (HMT) complex, but not if the neighboring 'Lys-9' residue is already methylated. H3 'Lys-4' methylation represents a specific tag for epigenetic transcriptional activation. The non-overlapping localization with SETD1A suggests that SETD1A and SETD1B make non-redundant contributions to the epigenetic control of chromatin structure and gene expression. Specifically tri-methylates 'Lys-4' of histone H3 in vitro.
GO Classification
Biological Process
transcription, DNA-dependent
histone H3-K4 methylation
regulation of transcription, DNA-dependent
Cellular Component
chromosome
nuclear speck
Set1C/COMPASS complex
Function
binding
nucleotide binding
nucleic acid binding
Molecular Function
histone-lysine N-methyltransferase activity
RNA binding
nucleotide binding
Cellular Location
  1. Nucleus speckle
  2. Chromosome
Pathways
Gene Properties
Chromosome Location 12
Locus 12q24.31
SNPs SETD1B
Gene Sequence
>5772 bp
ATGGAGAACAGTCACCCCCCCCACCACCACCACCAGCAGCCCCCGCCGCAGCCCGGCCCT
TCGGGCGAGAGGAGGAACCACCATTGGAGAAGTTACAAGTTGATGATTGACCCGGCTCTG
AAAAAGGGGCATCATAAACTGTACCGCTACGATGGGCAGCATTTCAGCCTGGCGATGTCC
AGCAACCGCCCGGTGGAAATTGTCGAAGATCCCCGGGTCGTCGGGATCTGGACCAAAAAC
AAGGAGCTGGAGCTGTCGGTGCCCAAATTCAAGATCGATGAGTTCTACGTGGGCCCGGTG
CCTCCGAAGCAGGTGACATTTGCCAAGCTGAATGATAACATCCGTGAAAACTTCCTGAGG
GACATGTGCAAGAAGTATGGGGAGGTGGAGGAGGTGGAGATTTTGTACAACCCCAAGACC
AAGAAGCACCTGGGCATCGCCAAGGTGGTCTTTGCCACGGTCCGGGGAGCCAAGGATGCC
GTTCAGCACTTGCACAGCACTTCCGTCATGGGCAACATTATCCACGTGGAGCTGGACACC
AAAGGGGAAACCCGAATGCGGTTCTATGAACTGTTGGTCACTGGCCGATACACCCCCCAG
ACCCTCCCAGTGGGCGAGCTGGACGCTGTCTCTCCAATCGTGAATGAGACCCTGCAGCTG
TCAGATGCCCTGAAGCGCCTCAAGGATGGAGGCCTGTCTGCAGGCTGTGGCTCCGGCTCC
TCCTCTGTCACCCCCAATAGCGGTGGGACACCCTTCTCCCAGGACACAGCTTATTCCAGC
TGCCGCCTGGACACACCCAACTCCTATGGACAGGGCACCCCGCTCACACCGCGCCTGGGC
ACCCCTTTCTCACAGGACTCCAGCTACTCCAGCCGCCAGCCCACACCCTCATACCTCTTC
AGCCAGGACCCTGCAGTGACCTTCAAGGCCCGGCGCCACGAGAGCAAGTTCACGGACGCC
TACAACCGCCGCCACGAACATCATTATGTACACAATTCTCCCGCGGTCACTGCGGTGGCC
GGGGCCACAGCCGCTTTCCGGGGTTCCTCGGACCTCCCGTTCGGAGCAGTCGGCGGCACT
GGGGGCAGCAGCGGTCCCCCGTTCAAGGCTCAACCACAGGATTCAGCCACATTTGCCCAC
ACTCCACCACCCGCCCAAGCAACCCCTGCTCCTGGATTCAAGTCTGCTTTCTCTCCGTAT
CAGACCCCAGTGGCCCACTTCCCTCCACCCCCGGAAGAGCCCACCGCCACAGCCGCTTTT
GGGGCCCGCGACAGTGGGGAGTTCCGGAGGGCACCGGCGCCCCCACCCCTGCCACCTGCT
GAGCCTCTGGCCAAGGAGAAGCCAGGCACGCCACCCGGCCCGCCGCCCCCCGACACCAAC
AGCATGGAGCTGGGCGGCCGGCCCACCTTCGGCTGGAGTCCTGAGCCCTGTGACAGCCCT
GGCACGCCCACGCTGGAGTCGTCCCCTGCAGGGCCAGAGAAACCCCACGACAGCCTGGAC
TCGCGCATCGAGATGCTGCTGAAGGAGCAGCGCACCAAGCTGCTCTTCCTGAGGGAGCCG
GACTCGGACACCGAGCTGCAGATGGAGGGCAGCCCCATCTCCTCCTCCTCCTCCCAGCTC
TCCCCACTGGCCCCCTTTGGCACCAACTCCCAGCCAGGCTTCCGGGGCCCCACGCCCCCC
TCGTCACGCCCCTCCAGCACCGGCCTGGAGGATATCAGCCCAACACCCCTCCCAGACTCC
GACGAGGACGAGGAGCTCGACCTGGGCCTTGGGCCTCGGCCTCCACCTGAGCCAGGCCCC
CCGGACCCTGCTGGGCTTCTGAGCCAGACAGCTGAGGTGGCCTTGGACCTGGTTGGAGAC
AGAACCCCGACCTCAGAGAAGATGGATGAGGGCCAGCAGTCCTCAGGCGAGGACATGGAG
ATCTCGGATGACGAGATGCCCTCGGCCCCCATCACCAGCGCTGACTGCCCCAAGCCCATG
GTGGTGACCCCAGGAGCGGCAGCCGTGGCAGCCCCTTCTGTGCTAGCCCCAACCCTGCCG
CTGCCCCCGCCACCTGGCTTCCCCCCGCTGCCCCCCCCACCACCACCACCCCCACCGCAG
CCTGGCTTCCCCATGCCCCCACCGCTGCCCCCACCGCCGCCCCCACCCCCTCCAGCCCAC
CCTGCTGTGACAGTGCCCCCACCACCCTTGCCAGCGCCGCCTGGAGTCCCGCCCCCACCC
ATCCTGCCACCACTGCCCCCCTTTCCGCCGGGCCTGTTCCCTGTGATGCAGGTGGACATG
AGCCACGTGCTGGGTGGCCAGTGGGGCGGCATGCCCATGTCCTTCCAGATGCAAACGCAG
GTGCTCAGCCGGCTGATGACGGGCCAGGGCGCCTGCCCCTACCCGCCCTTCATGGCCGCT
GCGGCCGCCGCTGCCTCAGCTGGGCTCCAGTTTGTCAACCTGCCGCCCTACCGGGGCCCC
TTCTCCCTGAGCAACTCCGGCCCAGGCCGCGGGCAGCACTGGCCACCACTGCCCAAGTTT
GACCCGTCAGTGCCTCCACCAGGCTACATGCCACGCCAGGAGGACCCACACAAAGCCACG
GTGGATGGCGTCCTGCTGGTGGTCCTCAAAGAACTCAAGGCCATCATGAAGCGTGACCTG
AACCGCAAGATGGTGGAAGTGGTGGCTTTCCGGGCCTTTGACGAGTGGTGGGACAAGAAG
GAGCGGATGGCCAAGGCCTCGCTGACCCCGGTGAAGTCGGGCGAGCACAAGGACGAGGAC
AGGCCGAAGCCCAAGGACCGCATCGCCTCGTGCCTGCTGGAGTCATGGGGCAAGGGCGAG
GGCCTGGGCTACGAGGGCCTGGGCCTGGGCATTGGGCTGCGTGGGGCCATTCGCCTGCCC
TCCTTCAAGGTCAAGAGGAAGGAGCCACCAGACACCACCTCATCTGGCGACCAGAAGCGG
CTGCGGCCCTCGACCTCTGTGGATGAGGAAGATGAAGAGTCCGAGCGAGAGCGAGACCGG
GATATGGCAGACACCCCCTGTGAGCTCGCCAAGCGGGACCCCAAGGGCGTGGGTGTGCGG
CGGCGGCCGGCGCGGCCTCTGGAGCTGGACAGTGGTGGGGAGGAGGACGAGAAGGAGTCA
TTGTCGGAGGAACAGGAGAGCACCGAGGAGGAAGAGGAGGCGGAGGAGGAGGAGGAGGAG
GAAGATGACGACGATGACGACAGTGATGACCGGGACGAGTCTGAGAACGATGACGAGGAC
ACAGCCCTGTCAGAGGCGAGTGAGAAGGACGAAGGGGACTCGGATGAAGAGGAGACAGTG
AGCATTGTAACCTCCAAGGCCGAAGCCACGTCGTCCAGTGAGAGTTCCGAGTCTTCTGAG
TTTGAGTCAAGCTCCGAGTCCTCGCCCTCATCCTCGGAGGATGAGGAGGAGGTAGTGGCC
AGGGAAGAGGAGGAAGAAGAGGAGGAGGAGGAGATGGTGGCCGAGGAAAGCATGGCTTCT
GCAGGCCCTGAGGACTTTGAGCAGGACGGGGAGGAAGCGGCTCTGGCCCCGGGGGCACCT
GCAGTGGACTCGTTGGGCATGGAAGAGGAGGTGGACATCGAGACTGAGGCTGTGGCCCCT
GAGGAGCGGCCCTCCATGCTGGACGAGCCCCCCTTGCCTGTGGGTGTTGAAGAGCCAGCG
GACTCCAGGGAGCCGCCTGAGGAACCAGGCCTGAGCCAGGAAGGGGCCATGTTGCTGTCT
CCAGAGCCCCCTGCCAAGGAGGTGGAGGCTCGACCCCCATTGTCCCCTGAGCGAGCTCCA
GAACATGACCTGGAAGTGGAGCCGGAGCCCCCTATGATGCTCCCCTTGCCGCTGCAACCA
CCATTGCCGCCCCCACGACCACCCCGGCCACCCAGCCCACCGCCGGAGCCTGAGACCACA
GATGCCTCACACCCATCTGTCCCTCCGGAGCCCCTTGCCGAGGACCACCCCCCGCATACT
CCAGGCCTCTGTGGCAGCCTGGCCAAGTCGCAGAGCACAGAGACGGTGCCAGCCACACCA
GGCGGGGAGCCCCCGCTATCAGGGGGCAGCAGTGGCCTGTCCCTGAGCTCTCCGCAAGTG
CCCGGCAGCCCCTTCTCCTACCCAGCCCCGTCCCCTAGCTTGAGCAGTGGGGGCCTCCCT
CGGACACCTGGCCGGGACTTCAGCTTCACACCCACCTTCTCCGAGCCCAGCGGGCCCTTG
CTCCTGCCCGTCTGCCCACTCCCCACTGGCCGACGCGATGAACGCTCCGGGCCCCTGGCC
TCCCCGGTGCTCCTGGAGACGGGCCTGCCCCTCCCTCTGCCCCTTCCCCTGCCCTTGCCC
TTGGCATTGCCCGCCGTCTTGCGGGCCCAGGCTCGTGCGCCCACCCCGCTGCCACCCCTG
CTGCCCGCCCCCCTGGCCTCTTGCCCTCCCCCAATGAAGAGGAAGCCGGGCCGGCCCCGG
CGATCCCCACCATCTATGCTCTCCTTGGATGGGCCCTTGGTCCGACCACCAGCAGGGGCC
GCCCTTGGAAGGGAACTCCTGCTCCTGCCGGGCCAGCCACAGACCCCCGTCTTCCCCAGC
ACCCATGACCCCCGGACGGTGACCCTGGACTTCCGGAACGCGGGGATCCCAGCCCCTCCA
CCACCCCTTCCCCCCCAGCCACCCCCACCCCCACCTCCCCCACCTGTAGAGCCCACCAAG
CTGCCCTTTAAGGAGCTAGACAACCAGTGGCCCTCCGAGGCCATTCCTCCGGGCCCCCGT
GGGCGCGATGAGGTCACTGAGGAATACATGGAGTTGGCCAAGAGCCGGGGGCCGTGGCGC
CGGCCACCTAAGAAGCGCCATGAGGACCTGGTGCCACCTGCGGGCTCGCCCGAACTCTCG
CCACCCCAGCCCCTCTTCCGGCCCCGCTCGGAGTTTGAGGAGATGACCATCCTGTATGAC
ATCTGGAACGGTGGCATCGATGAGGAGGACATCCGCTTCCTGTGTGTCACCTACGAGCGA
CTGCTACAGCAGGACAATGGCATGGACTGGCTTAACGACACGCTCTGGGTCTACCATCCC
TCCACCAGCCTCTCTTCAGCTAAGAAGAAGAAACGGGACGATGGCATCCGCGAGCACGTG
ACGGGCTGTGCCCGCAGTGAGGGCTTCTACACCATCGACAAGAAGGACAAGCTCAGATAC
CTCAACAGCAGCCGTGCCAGCACCGATGAGCCCCCCGCAGACACCCAGGGCATGAGCATC
CCAGCACAGCCCCACGCCTCCACCCGGGCAGGCTCGGAGCGGCGTTCGGAGCAGCGCCGC
CTGCTGTCCTCCTTCACTGGCAGCTGTGACAGTGACCTGCTCAAGTTCAACCAGCTCAAG
TTCCGGAAGAAAAAGCTCAAGTTCTGCAAGAGCCACATTCACGACTGGGGCTTGTTCGCC
ATGGAGCCCATCGCGGCTGACGAGATGGTCATCGAGTACGTGGGCCAGAATATCCGTCAG
GTGATCGCAGACATGCGGGAGAAGCGTTATGAGGACGAGGGCATCGGGAGCAGCTACATG
TTCCGGGTGGACCATGACACCATCATCGACGCCACCAAGTGCGGCAACTTCGCGCGCTTC
ATCAACCACAGCTGCAACCCCAACTGCTATGCCAAGGTGATCACGGTGGAGTCACAGAAG
AAGATAGTCATCTACTCGAAGCAGCACATTAACGTCAATGAGGAGATTACCTATGACTAT
AAGTTCCCCATCGAGGACGTCAAGATCCCCTGCCTCTGTGGCTCCGAGAACTGCCGGGGG
ACCCTCAACTAG
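The 5772 bp length given for the gene sequence can be cross-checked against the residue count listed under Protein Properties: a complete coding sequence of 5772 bases holds 1924 codons, and the final one is a stop codon, leaving 1923 residues. The Python sketch below illustrates that check under stated assumptions. The function name `expected_residues` is hypothetical and is not part of any HMDB tooling, and the toy sequence joins only the first nine bases and the last three bases of the sequence above.

```python
# Minimal sketch: relate the length of a complete coding sequence (CDS)
# to the number of amino acids it encodes. Names here are illustrative only.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def expected_residues(cds: str) -> int:
    """Return the residue count encoded by a complete CDS (stop codon excluded)."""
    cds = "".join(cds.split()).upper()  # drop line breaks and spaces
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    if not cds.startswith("ATG"):
        raise ValueError("CDS does not start with an ATG start codon")
    if cds[-3:] not in STOP_CODONS:
        raise ValueError("CDS does not end with a stop codon")
    return len(cds) // 3 - 1  # every codon except the terminal stop


# Toy CDS: first nine bases of the record (ATG GAG AAC) plus its stop codon (TAG).
toy_cds = "ATGGAGAAC" + "TAG"
toy_count = expected_residues(toy_cds)

# The same arithmetic applied to the lengths stated in the record.
record_count = 5772 // 3 - 1
```

Passing the full sequence, with its line breaks, to `expected_residues` should give the same value as `record_count` if the record is internally consistent.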
Protein Properties
Number of Residues 1923
Molecular Weight 208729.73
Theoretical pI 4.954
Pfam Domain Function
Signals Not Available
Transmembrane Regions Not Available
Protein Sequence
>Histone-lysine N-methyltransferase SETD1B
MENSHPPHHHHQQPPPQPGPSGERRNHHWRSYKLMIDPALKKGHHKLYRYDGQHFSLAMS
SNRPVEIVEDPRVVGIWTKNKELELSVPKFKIDEFYVGPVPPKQVTFAKLNDNIRENFLR
DMCKKYGEVEEVEILYNPKTKKHLGIAKVVFATVRGAKDAVQHLHSTSVMGNIIHVELDT
KGETRMRFYELLVTGRYTPQTLPVGELDAVSPIVNETLQLSDALKRLKDGGLSAGCGSGS
SSVTPNSGGTPFSQDTAYSSCRLDTPNSYGQGTPLTPRLGTPFSQDSSYSSRQPTPSYLF
SQDPAVTFKARRHESKFTDAYNRRHEHHYVHNSPAVTAVAGATAAFRGSSDLPFGAVGGT
GGSSGPPFKAQPQDSATFAHTPPPAQATPAPGFKSAFSPYQTPVAHFPPPPEEPTATAAF
GARDSGEFRRAPAPPPLPPAEPLAKEKPGTPPGPPPPDTNSMELGGRPTFGWSPEPCDSP
GTPTLESSPAGPEKPHDSLDSRIEMLLKEQRTKLLFLREPDSDTELQMEGSPISSSSSQL
SPLAPFGTNSQPGFRGPTPPSSRPSSTGLEDISPTPLPDSDEDEELDLGLGPRPPPEPGP
PDPAGLLSQTAEVALDLVGDRTPTSEKMDEGQQSSGEDMEISDDEMPSAPITSADCPKPM
VVTPGAAAVAAPSVLAPTLPLPPPPGFPPLPPPPPPPPPQPGFPMPPPLPPPPPPPPPAH
PAVTVPPPPLPAPPGVPPPPILPPLPPFPPGLFPVMQVDMSHVLGGQWGGMPMSFQMQTQ
VLSRLMTGQGACPYPPFMAAAAAAASAGLQFVNLPPYRGPFSLSNSGPGRGQHWPPLPKF
DPSVPPPGYMPRQEDPHKATVDGVLLVVLKELKAIMKRDLNRKMVEVVAFRAFDEWWDKK
ERMAKASLTPVKSGEHKDEDRPKPKDRIASCLLESWGKGEGLGYEGLGLGIGLRGAIRLP
SFKVKRKEPPDTTSSGDQKRLRPSTSVDEEDEESERERDRDMADTPCELAKRDPKGVGVR
RRPARPLELDSGGEEDEKESLSEEQESTEEEEEAEEEEEEEDDDDDDSDDRDESENDDED
TALSEASEKDEGDSDEEETVSIVTSKAEATSSSESSESSEFESSSESSPSSSEDEEEVVA
REEEEEEEEEEMVAEESMASAGPEDFEQDGEEAALAPGAPAVDSLGMEEEVDIETEAVAP
EERPSMLDEPPLPVGVEEPADSREPPEEPGLSQEGAMLLSPEPPAKEVEARPPLSPERAP
EHDLEVEPEPPMMLPLPLQPPLPPPRPPRPPSPPPEPETTDASHPSVPPEPLAEDHPPHT
PGLCGSLAKSQSTETVPATPGGEPPLSGGSSGLSLSSPQVPGSPFSYPAPSPSLSSGGLP
RTPGRDFSFTPTFSEPSGPLLLPVCPLPTGRRDERSGPLASPVLLETGLPLPLPLPLPLP
LALPAVLRAQARAPTPLPPLLPAPLASCPPPMKRKPGRPRRSPPSMLSLDGPLVRPPAGA
ALGRELLLLPGQPQTPVFPSTHDPRTVTLDFRNAGIPAPPPPLPPQPPPPPPPPPVEPTK
LPFKELDNQWPSEAIPPGPRGRDEVTEEYMELAKSRGPWRRPPKKRHEDLVPPAGSPELS
PPQPLFRPRSEFEEMTILYDIWNGGIDEEDIRFLCVTYERLLQQDNGMDWLNDTLWVYHP
STSLSSAKKKKRDDGIREHVTGCARSEGFYTIDKKDKLRYLNSSRASTDEPPADTQGMSI
PAQPHASTRAGSERRSEQRRLLSSFTGSCDSDLLKFNQLKFRKKKLKFCKSHIHDWGLFA
MEPIAADEMVIEYVGQNIRQVIADMREKRYEDEGIGSSYMFRVDHDTIIDATKCGNFARF
INHSCNPNCYAKVITVESQKKIVIYSKQHINVNEEITYDYKFPIEDVKIPCLCGSENCRG
TLN
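The protein sequence above is wrapped at 60 residues per line under a FASTA-style header, so its length and rough charge composition can be recomputed from the text. The Python sketch below shows one way to do this. The helper names `read_fasta_sequence` and `charge_counts` are hypothetical, the sample uses only the last two lines of the sequence, and counting D/E against K/R is a deliberate simplification (histidine and the termini are ignored) that does not reproduce the theoretical pI calculation.

```python
# Minimal sketch: rejoin a wrapped FASTA record and count charged residues.
# Helper names are illustrative and do not come from HMDB or UniProt tooling.
from typing import Tuple


def read_fasta_sequence(text: str) -> str:
    """Join the residue lines of one FASTA record, skipping the '>' header."""
    lines = [ln.strip() for ln in text.splitlines()]
    return "".join(ln for ln in lines if ln and not ln.startswith(">")).upper()


def charge_counts(seq: str) -> Tuple[int, int]:
    """Return (acidic, basic) counts, taking D/E as acidic and K/R as basic."""
    acidic = seq.count("D") + seq.count("E")
    basic = seq.count("K") + seq.count("R")
    return acidic, basic


# Sample: header plus the last two lines of the protein sequence in this record.
sample = """>Histone-lysine N-methyltransferase SETD1B (C-terminal fragment)
INHSCNPNCYAKVITVESQKKIVIYSKQHINVNEEITYDYKFPIEDVKIPCLCGSENCRG
TLN
"""

fragment = read_fasta_sequence(sample)
fragment_length = len(fragment)
acidic, basic = charge_counts(fragment)
```

Run on the full record, `len(read_fasta_sequence(...))` can be compared with the Number of Residues field, and an excess of acidic over basic residues would be in line with the acidic theoretical pI reported above.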
GenBank ID Protein 210032580
UniProtKB/Swiss-Prot ID Q9UPS6
UniProtKB/Swiss-Prot Entry Name SET1B_HUMAN
PDB IDs
GenBank Gene ID NM_015048.1
GeneCard ID SETD1B
GenAtlas ID SETD1B
HGNC ID HGNC:29187
References
General References Not Available