E Coli Protein to rRNA Attachment Site Maps

Background

The ribosome of E Coli is one of the best understood large organelles. The Noller Lab has produced beautiful images that document what has been found with crystallography.

I recommend going to their web page for a deep visualization of how it works:

Noller Lab

And here’s a fun gif animation…

By Bensaccount at en.wikipedia, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=8287100

Watch where the mRNA enters and leaves the small subunit, and where the amino acid chain (the final protein) leaves the top of the large subunit. At some point the ribosome switches to exporting that protein.

Here’s the rRNA secondary structure diagram the Noller lab produced for 16s - the rRNA for the small subunit (30s):

The pattern is based on the natural nucleotide bonding patterns: stems and loops. The actual layout is curated to match known traits of the physical structure, so it’s known what’s a stem, what’s a loop, and where each is found. The artists who created these maps helped crystallize what was learned from crystallography into these images.

Here’s the rRNA, 23s, for the large subunit (50s):

Matching DNA to rRNA

By simply matching the DNA codon sequences for ribosomal proteins, I found structural matches for ten attachment sites in E Coli, involving six ribosomal proteins: rplD, rplB, rpsD, rpsH, rpsE, rplC.

Then I found three more sites in the human 18s rRNA with the S4 and L4 encoding genes, rps4x and rpl4. They match sites similar to those of E Coli’s S4 and L4, but each was slightly longer.

Both rps4x 1-13 and rpl4 1-11 were on sites shaped similarly to those in E Coli. I checked rpl4@283 because I saw in Alpha Fold that it was a ribbon section near rpl4 1-11. I found it right next to rpl4’s 1-11 site in the secondary structure diagram. The patterns are subtle and flowing: back-and-forth twisting patterns and long sequences of nearly perfect matches.

Some of the patterns look “improbably stretched” in the secondary structure diagram until you see that they are also at structural regions: the top of a loop, the neck of a stem, or one of those weird bumps on a loop. These are where a flowing bond makes sense structurally. These may be multiple simultaneous attachment points for flowing structures, alternate bonding sites based on fluid flow patterns, or redundant layers built in so the site does not slip. Could it be a coincidence?

  • Two organisms, eight genes - all the ones where I had confidence I knew where to look, found where expected. All sites have consecutive or nearly consecutive perfect matches between the protein’s codons and the rRNA. See the table below for specific codon index to rRNA letter index matches.

This data is widely available and well studied, but no one thought to put it together. And yet there is such a high degree of correlation, with clear structural patterns that look designed and match what we know about the ribosome’s form and function.

Are multiple proteins sharing the same attachment sites for redundancy or flowing assembly? Or maybe the rRNA has enough bonding energy to hold them all in shape? It does keep getting recharged with energy from the ATP-charged tRNAs that continually connect up.

From the structural pictures, it feels like these are flowing bonds, adjusting shapes, backups for important attachments.

E Coli 23s Attachment Sites

The patterns are so consistent and structurally logical that they suggest an organizational principle we’ve been missing. Here’s what the binding sites look like one-by-one so you can see how the consecutive proteins map to the sites:

E Coli 16s Attachment Sites

H. Sapiens 18s - rps4x

H. Sapiens 18s - rpl4

Observations

  • These proteins are all found roughly where they are expected from crystallography. The patterns jump out when you have the sequences and are looking in the right place. When I’m looking in the wrong place, I can stare at it for hours and not find anything.

  • The attachment sites correspond to regions of similar shapes when comparing them to Alpha Fold’s images: all relatively flat spots, strands, ribbons, not helices.

  • Each attachment site has a clear entrance and exit that makes sense structurally.

  • Attachment sites all are structural: on stems, loops, crossing small open sections (ponds) and between stems (bays).


  • Here’s the longest matching sequence: rpsH on 16s, with 22 codons in a row and only one codon out of sequence, going down one stem, around a loop, and down another, using an opposite pattern to rplB on the same stem.

  • rplB has the most beautiful sequence: two bonds on a long stem. Right in the middle of this long stem the rRNA has G - U and U - G, the weak opposites next to each other in symmetry, creating a more flexible part. On top of that, the protein’s next codon has CCA: opposites to G and G, while A and U attract.

This creates a crazy flowing bond/repel action right where this “bendy” section of the long stem is. Follow that up with two more three-letter matches, then cross a big loop to another three-letter match. Further, rplB is also on 16s, interacting on a similar stem with rplD. This creates a chain of fluid reactions, coordinated by the fluid energy exchange of the tRNA.


  • Sites are shared by multiple proteins and for the two proteins rplD and rplB that were found on both 23s and 16s, the interaction patterns match on both sites. In particular on 23s, rplD crosses an open pond:

and here again on 16s.


For the more complex human genome, so far I have only looked for one gene, the one for S4 (rps4x), and found it in about an hour on 18s. It’s the most complex attachment site yet, with nine codons going up the stem and five coming back down.


and similar detail for rpl4 1-11.


and even more complex for rpl4 283-293. As predicted by Alpha Fold, it is next to the 1-11 site.


The top ‘C’ folds over into that small pond for two of these bonds.


Three layers of overlapping patterns going up and down the same stem.

Practically all of the bonds are twisting patterns, and the exit is a perfect UUU right in the center of a loop with complementary patterns.

Note on Notation

For convenience, only the rRNA’s pattern is shown, i.e. the table shows the anti-codon of the DNA codon. For each match, consecutive letters in the DNA match consecutive letters in the rRNA. Matches go in either direction, zig-zag, etc., but the order is preserved, just as the amino acids will be ordered in the protein chain.

Nucleotides have these opposite patterns between DNA and RNA:

DNA <=> RNA
G <=> C
C <=> G
A => U
T => A
T ~= U

So rplD@1, the first codon of rplD, is ATG, but the table shows the mirror UAC, which matches directly to the rRNA’s UAC at letter positions 955, 1224, 1223.
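The letter mapping above can be sketched in a few lines of Python. This is only a minimal illustration of the notation, not the genetic analyzer’s actual code; the names `DNA_TO_MIRROR_RNA` and `mirror_codon` are made up for this example.

```python
# Hypothetical helper (not from the genetic analyzer): maps each DNA letter
# to the "opposite" RNA letter listed in the notation above.
DNA_TO_MIRROR_RNA = {"G": "C", "C": "G", "A": "U", "T": "A"}


def mirror_codon(dna_codon: str) -> str:
    """Return the RNA mirror of a DNA codon, letter by letter, order preserved."""
    return "".join(DNA_TO_MIRROR_RNA[base] for base in dna_codon.upper())


print(mirror_codon("ATG"))  # UAC, the start codon as it appears in the table
```

The order of the letters is kept as-is, since the tables list the rRNA indices in codon order rather than reversing the strand.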

Flow

Flow type of 888 means that each of the three DNA letters in the codon matches its opposite in the rRNA.

A big X indicates the perfect conflict, where G would meet C or A would meet U. These are exit patterns. To be clear, this is where the letter in the DNA codon matches the rRNA letter: same-spinning signatures repel other same-spinning signatures.

A small x for a letter indicates the non-perfect conflict - e.g. G repels U but less so than C, and A repels C but less so than U.

A small w for a letter indicates the weak match - e.g. G is friendly with A, U is friendly with C.
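Here is one way to read the flow rules above as code. It is a sketch under assumptions: both inputs are written in RNA letters, the first being the mirrored codon as shown in the tables and the second the rRNA letters it is laid against. The names `flow_letter` and `flow` are hypothetical, and missing letters (the `-` marks in the tables) are not handled.

```python
# One possible reading of the flow rules (an assumption, not the analyzer's code).
PERFECT_CONFLICT = {frozenset("GC"), frozenset("AU")}  # big X
PARTIAL_CONFLICT = {frozenset("GU"), frozenset("AC")}  # small x
WEAK_MATCH = {frozenset("GA"), frozenset("UC")}        # small w


def flow_letter(shown: str, rrna: str) -> str:
    """Classify one mirrored codon letter against one rRNA letter."""
    if shown not in "GCAU" or rrna not in "GCAU" or len(shown) != 1 or len(rrna) != 1:
        raise ValueError(f"expected single RNA letters, got {shown!r} and {rrna!r}")
    if shown == rrna:
        return "8"  # the DNA letter meets its opposite in the rRNA
    pair = frozenset((shown, rrna))
    if pair in PERFECT_CONFLICT:
        return "X"
    if pair in PARTIAL_CONFLICT:
        return "x"
    return "w"  # only the weak-match pairs remain


def flow(shown_codon: str, rrna_letters: str) -> str:
    """Build the flow string for a mirrored codon and the rRNA letters it meets."""
    return "".join(flow_letter(s, r) for s, r in zip(shown_codon, rrna_letters))


print(flow("UAC", "UAC"))  # 888
print(flow("UUU", "CUU"))  # w88
```

Under this reading, UUU laid against CUU gives w88, which is the flow listed for rplD@6 in the table below.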

Stem Patterns

Stem patterns are numbered from the perspective of the start codon:

*1 . 2    2 . *1
3 . 4     4 .  3
5 . 6     6 .  5

5 . 6     6 .  5
3 . 4     4 .  3
*1 . 2    2 . *1

The Type column is an attempt to keep track of direction but it goes back and forth so that’s a work in progress.

E Coli 16s/23s Mapping Table

Protein@Codon rRNA Indices Type(WIP) Match Pattern rRNA Region rRNA code Flow Partners
rplD@1 955,1224,1223 flipped stem-124 16s 3’M UAC 888
rplD@2 1228,952,967 stem-bay-cross 16s 3’M CUU 888
rplD@3 958,959,960 normal loop 16s 3’M AAU 888
rplD@4 975,976 normal pond-cross 16s 3’M CAU/AG xx-
rplD@5 977,978,979 normal loop 16s 3’M AAC 888
rplD@6 980,981,982 normal loop 16s 3’M UUU/CUU w88 rpsE@6
rplD@7 985,986,987 normal stem-135 16s 3’M CUG 888
rplD@8 1218,1217 normal stem-13 16s 3’M CGC/CC 8x8 rpsH@6,…
rplD@9 988,989,990 normal stem-135 16s 3’M GUC 888 rpsH@5,…
rplD@10 pond-cross 16s 3’M GCU
rplD@11 1045,1046,1210 loop-stem-jump 16s 3’M CGC 888
rplD@12 1211,1209 exit bulge-exit 16s 3’M GAC XXX
—————
rplD@63 2076,2075,2074 flipped loop-stem 23s V UUU 888
rplD@64 2073,2072,2443 flipped stem-134 23s V CCG 888
rplD@65 2441,2439,2069 flipped stem-loop-cross 23s V UUG 888
rplD@66 2442,2443,2444 normal stem-135 23s V CCG 888
rplD@67 2445,2065 stem-loop-1x 23s V GCA 88x
rplD@68 2066,2067,2442 normal stem-loop-1x 23s V CGC 888
rplD@69 exit bay-exit 23s V CGC XXw
rplD@90 2091,2092,2093 normal loop-stem 23s V GUC 888
rplD@91 2196,2229,2228 pond-cross 23s V CUG 888 rplB@257,rplB@256
rplD@92 2221,2220 pond-cross-1x 23s V GUG -88
rplD@93 2219,2206,2205 stem-142 23s V UCA 888
rplD@94 2204,2203,2202 stem-bump 23s V GUU 888
rplD@95 exit bay-exit 23s V UUU xwX
—————
rplB@1 991,1216,1217 normal stem-124 16s 3’M UAC 888
rplB@2 1218,987,986 normal stem-124 16s 3’M CGU 888
rplB@3 988,990 normal stem-loop-1x 16s 3’M CAA 88-
rplB@4 959,958 normal loop-receive-1x 16s 3’M CAA -88
rplB@5 957,956,955 normal loop 16s 3’M UUU 888
rplB@6 1225,1226,1227 normal stem-with-bump 16s 3’M ACA 888
—————
rplB@252 2185,2186,2112 normal stem-136 23s V UGG 888
rplB@253 2189,2188,2187 normal stem-124 23s V UUU 888
rplB@254 2100,2099,2190 mid-stem-flex 23s V CCA/GUUG XwwW
rplB@255 2098,2192,2096 flipped stem-145 23s V UUC 888
rplB@256 2194,2195,2196 normal stem-135 23s V UUC 888 rplD@91
rplB@257 2229,2228,2230 loop-stem 23s V UGG 888 rplD@91
rplB@258 exit pond-exit 23s V GCA/UA xw--
—————
rpsD@1 948,947,946 flipped stem-135 16s 3’M UAC 888
rpsD@2 1230,1231,1232 normal stem-135 16s 3’M UCG 888
rpsD@3 1229 bay-cross 16s 3’M UCU
rpsD@4 1244,1243,1242 weak-opposite 16s 3’M TAT/GCG www
rpsD@5 1237,1238,1239 normal loop 16s 3’M CAA 888
rpsD@6 1240 pond-cross 16s 3’M CCA/A x--
rpsD@7 1339,1338,1337 flipped loop 16s 3’M GGA 888
rpsD@8 1340,1341,943 flipped neck-weak-opposite 16s 3’M UUC/A 8x8
rpsD@9 945,946,947 normal loop-stem 16s 3’M GAG 888
rpsD@10 948 exit mid-stem-exit 16s 3’M UUC/G x--
—————
rpsH@1 1040,1000,999 flipped stem-124 16s 3’M UAC 888 rpsE@1
rpsH@2 998,1043 stem-12 16s 3’M UCG -88 rpsE@2
rpsH@3 997,996,995 flipped loop 16s 3’M UAC 888
rpsH@4 993,992,991 flipped loop 16s 3’M GUU 888
rpsH@5 990,989,1216 flipped stem-134 16s 3’M CUA 888
rpsH@6 988,987,1218 flipped stem-134 16s 3’M CUA 888
rpsH@7 986,1219,1220 normal stem-124 16s 3’M UAG 888
rpsH@8 985,1221,984 flipped stem-124 16s 3’M CGC 888
rpsH@9 exit mid-pond-exit 16s 3’M CUA/AG xX-
—————
rpsE@1 1040,1000,999 flipped stem-124 16s 3’M UAC 888 rpsH@1
rpsE@2 998,1043,1042 normal stem-124 16s 3’M CGA 888 rpsH@2
rpsE@3 1041 pond-cross-2x 16s 3’M CUG/AA 8XX
rpsE@4 bay-cross-3x 16s 3’M UAG/ACA Xww
rpsE@5 980,981,982 normal loop 16s 3’M CUU 888
rpsE@6 bay-cross-3x 16s 3’M UUU rplD@6
rpsE@7 993,992,991 flipped loop 16s 3’M GUU 888 rpsH@4
rpsE@8 990,1215,1216 normal stem-124 16s 3’M CGA 888 rpsH@5, rplD@9
rpsE@9 1217,1218 normal stem-13x 16s 3’M CCG 88- rpsH@6, rplD@8
rpsE@10 exit side-exit 16s 3’M CUU
—————
rplC@1 2440,2439,2438 flipped loop-stem 23s V UAC 888 rplD@65
rplC@2 2441,2070,2071 stem-124 23s V UAA 888 rplD@65
rplC@3 2072,2073,2435 stem-136 23s V CCA 888 rplD@64
rplC@4 2434,2433,2243 pond-cross 23s V AAU 888
rplC@5 2078,2077,2242 stem-132 23s V CAG 888
rplC@6 2247,2248 pond-cross 23s V CCA x88
rplC@7 2245,2244,2243 loop 23s V UUU 888
rplC@8 2074,2075,2076 loop 23s V UUU 888 rplD@63
—————
rplC@205 2249,2250,2251 loop-stem 23s V GGU 888
rplC@206 2248,2256,2247 stem-123 23s V CGA 888
rplC@207 2258,2247,2248 stem-135 23s V CAC 888 rplC@6
rplC@208 2257,2259,2260 cross-stem 23s V UUC 888
rplC@209(end) 2261,2279,2260 cross-stem 23s V CGC 888
+————— —————— ———— ——————– ————- ———- ———– ——————-+
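The Partners column can be cross-checked mechanically by looking for rows from different proteins that land on the same rRNA letters. The sketch below does that for a handful of rows copied from the table above; `shared_sites` and its `min_shared` threshold are hypothetical names chosen for this example, not part of the genetic analyzer.

```python
# A few rows copied from the table above: (protein@codon, rRNA indices).
rows = [
    ("rpsH@1", (1040, 1000, 999)),
    ("rpsE@1", (1040, 1000, 999)),
    ("rplD@63", (2076, 2075, 2074)),
    ("rplC@8", (2074, 2075, 2076)),
    ("rplB@5", (957, 956, 955)),
]


def shared_sites(rows, min_shared=2):
    """Return pairs of rows from different proteins that share rRNA indices.

    min_shared is an assumed threshold: how many indices two rows must
    have in common before they are reported as partners.
    """
    pairs = []
    for i, (name_a, idx_a) in enumerate(rows):
        for name_b, idx_b in rows[i + 1:]:
            different_protein = name_a.split("@")[0] != name_b.split("@")[0]
            common = set(idx_a) & set(idx_b)
            if different_protein and len(common) >= min_shared:
                pairs.append((name_a, name_b, sorted(common)))
    return pairs


for name_a, name_b, common in shared_sites(rows):
    print(name_a, name_b, common)
# rpsH@1 rpsE@1 [999, 1000, 1040]
# rplD@63 rplC@8 [2074, 2075, 2076]
```

Both pairs it reports are already listed as partners in the table; running the same check over every row would be one way to fill in partners that were missed by eye.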

H Sapiens

Ribosomal proteins S4 and L4 were found in a region shaped similarly to that of rpsD/rplD, the genes for S4/L4 in E Coli:

Protein@Codon rRNA Indices Type(WIP) Match Pattern rRNA Region rRNA code Flow Partners
rps4x@1 1043,1080,1044 stem-123 18s UAC 888
rps4x@2 1078,1079,1080 stem-135 18s CGA 888
rps4x@3 1042,1081,1041 stem-123 18s GCA 888
rps4x@4 stem-to-bay 18s CCA
rps4x@5 1046,1048,1049 stem-to-loop-jump 18s GGG 888
rps4x@6 1050,1051,1052 loop-to-one-stem 18s UUC 888
rps4x@7 1075,1076,1077 loop-to-stem 18s UUC 888
rps4x@8 1046,1076,1047 stem-143 18s GUA 888
rps4x@9 1074,1073,1052 loop-stem-twist 18s GAC 888
rps4x@10 1075,1076,1077 loop-to-stem 18s UUC 888 rps4x@7
rps4x@11 1046,1077,1078 stem-124 18s GCC 888
rps4x@12 1044,1080,1081 stem-126 18s CAC 888
rps4x@13 1082,1042,1043 loop-stem-twist 18s CGU 888
rps4x@14 1044,1079,1080 stem-124 18s CGA 888
rps4x@15 1042 exit exit-stem-base 18s GGU 8--
—————
rpl4@1 1043,1080,1044 stem-123 18s UAC 888 rps4x@1
rpl4@2 1078,1046,1077 stem-143 18s CGC 888
rpl4@3 1078,1046,1077 stem-143 18s ACA 88-
rpl4@4 1078,1079,1080 stem-135 18s CGA 888
rpl4@5 1079,1078,1046 stem-136 18s GCG 888
rpl4@6 1048,1049,1051 loop-with-skip 18s GGU 888
rpl4@7 1076,1075,1052 loop-stem-flip 18s GAC 888
rpl4@8 1077,1047,1078 loop-cross-stem 18s UAU 888
rpl4@9 1047,1046,1078 stem-136 18s AGC 888
rpl4@10 1044,1080,1081 stem-146 18s CAC 888
rpl4@11 1083,1084,1042 loop-stem-cross 18s AUG 888
rpl4@12 1041 exit bay-exit 18s AGG 8--
—————
rpl4@283 1119,1120,1121 loop 18s UUC 888
rpl4@284 1125,1124,1123 loop-to-stem 18s UAC 888
rpl4@285 1119,1124,1118 stem-123 18s UAA 888
rpl4@286 1117,1116,1127 stem-132 18s UUA 888
rpl4@287 1117,1126,1125 stem-124 18s UGU 888
rpl4@288 1121,1119,1124 fold-over-tip 18s CUA 888
rpl4@289 1118 flexing-fold 18s GAA -8-
rpl4@290 1116,1114,1115 stem-bump-flip 18s UCG 888
rpl4@291 1117,1128,1116 weird-stem-bend 18s UCU 888
rpl4@292 1125,1117,1126 stem-123 18s UAG 888
rpl4@293 1118,1124,1123 stem-to-loop 18s AAC 888
rpl4@294 1119,1120 loop-exit 18s UUU 88-
+————— —————— ———— ——————– ————- ———- ———– ——————-+

Alpha Fold Images - E Coli - rplD

For the three sites, notice that all matches occur in a strand, or just at the start of a ribbon section. The shapes roughly match up, though Alpha Fold’s precision is limited by the resolution of the crystallography techniques used to train it.

Conclusions

There is too much pattern here for this to be random. These patterns show how the ribosome proteins tie together the two rRNA skeletons, how the flowing rhythmic behavior is formed using patterned bonds that are designed to flex and contract. This is how each tRNA + amino acid gets drawn in, and advanced from stage to stage. I’m not an expert in this field but I believe this is consistent with existing science and adds a layer of understanding.

My personal takeaways include:

  • The bonding pattern for amino acids in a protein chain is influenced by the codon pattern.
  • An amino acid’s codon pattern affects its bonding, so it will bond with an opposite pattern using figure 8 attraction from opposite orbital signature patterns.
  • If dark matter orbital signatures are the forces behind covalent bonds, they explain why codons can transmit patterns: dark orbital signatures have state and memory and are programmable with fluid flow.
  • Cellular dynamics are controlled by fluid flow systems - ordered, minimizing turbulence, not entropy and chaos.
  • The tRNA acts like a noise-cancelling amplifier that transfers the codon’s orbital signature to its amino acid.
  • When the tRNA attaches to its corresponding mRNA, the energy from the ATP flows into the rRNA, expanding the skeleton; the proteins manage the timing and the contraction phase after that energy stabilizes. The spacing and timing patterns are built into the loops of the rRNA and held together by the proteins of the ribosome. The exact form, shape, stretchiness, and contraction pattern are complex, but you can see them from the fluid interactions of these connections and interconnections.
  • Lack of flow in this system will stand out and cause turbulence. So incorrect patterns, incorrect fluid flow, or disharmony will shut it down. Fluid flow drives the whole process. For example, the leader and follower sections have very different flow patterns, so snRNPs that help cut out non-coding sections could be flow control devices.
  • ATP’s energy is released but conserved. Dark matter fluid is designed to preserve its spinning nature and minimize disruption, finding spinning/counter-spinning flows that work at a lower energy state, i.e. with less turbulence. Otherwise, it spins indefinitely (or for our universe’s lifecycle). Turbulence is eliminated with spinning and counter-spinning boundary layers. Cells are built on that same paradigm.
  • Ribosomal proteins maintain the shape and flow and ensure the right contact points are met. The rRNA is the main fluid flow channeller. Its shape can be seen either as a fluid flow system, with circular eddies forming in the ponds and canyons in the bays, or as a landing strip for arriving proteins, guided in on fluid flow until they find the right stem connections. So the shape of the rRNA defines how it gets assembled, how it expands as the next tRNA arrives, and how proteins help it contract with the right timing and shape.

This discovery opens up a whole new level of understanding how things work.

Next Steps

Code these patterns into the genetic analyzer so it can detect back-to-back sequences of two or more matching codons in a row.
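As a rough idea of what that detection step could look like, here is a sketch that scans a list of per-codon flow strings and reports runs of two or more consecutive perfect (888) codons. It is one interpretation of the goal, not the genetic analyzer’s implementation; `perfect_runs` and `min_len` are names invented for this example.

```python
def perfect_runs(flows, min_len=2):
    """Return (first_codon, last_codon) for each run of consecutive '888' flows.

    Codon numbers are 1-based. Only runs of at least min_len codons are kept.
    """
    runs = []
    start = None
    for codon, flow in enumerate(flows, start=1):
        if flow == "888":
            if start is None:
                start = codon
        else:
            if start is not None and codon - start >= min_len:
                runs.append((start, codon - 1))
            start = None
    # Close a run that reaches the end of the list.
    if start is not None and len(flows) - start + 1 >= min_len:
        runs.append((start, len(flows)))
    return runs


# Flow column for rplD@1 through rplD@12 on 16s, copied from the table.
# rplD@10 has no flow listed, so it is left as an empty string here.
rpld_16s = ["888", "888", "888", "xx-", "888", "w88",
            "888", "8x8", "888", "", "888", "XXX"]

print(perfect_runs(rpld_16s))  # [(1, 3)]
```

A real version would probably want a looser rule that also tolerates near-perfect codons such as w88, since the site descriptions above treat those as part of the same flowing match.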

Find all RNA that lives in each region of each cell and automate the process of mapping RNA to Proteins. How many proteins use RNA as a skeleton or skin?

Code the genetic analyzer to find bonding patterns two or more codons long between proteins using its partners mechanism.


More Info:

The genetic analyzer - a tool that analyzes proteins preserving the codon information.

Fluid DNA - More info on how the process works with fluid dynamics.