Results and discussion

Dataset processing

Prediction of open reading frames (ORFs) from the dataset of 124 patients presented in  revealed an average of 203,300 potential ORFs per sample. Use of BLAST
sequence matching resulted in predicted protein functions for, on average, 46% of the ORFs per sample. Subsequent characterisation of these putative protein sequence fragments using the KEGG database allowed for metabolic classification of 39% of the ORFs with BLAST hits (18% of the original predicted ORF set). Each microbiome sample had an average of 2,400 KEGG Orthology (KO) groupings containing at least one sequence fragment, with a total of 4,849 KOs present in at least one sample in the dataset.

Distributions of predicted metabolic functions between low and high-BMI groups

Sequence counts for all 4,849 KOs were compared across patients in order to identify metabolic functions that differ in abundance between low-BMI (18 to 22) and high-BMI (30+) associated samples. The KOs present ranged in relative abundance from 4 × 10⁻⁵ (i.e. one copy of the protein in the largest
sample) to 0.8% of the total assigned proteins, with K06147 (bacterial ATP-binding cassette, subfamily B) as the most abundant KO across all patients, regardless of BMI. Fifty-two KOs were found to differ significantly (Bonferroni-corrected p-value < 0.01) in abundance between lean- and obese-related samples. The majority of these KOs were low in frequency in both BMI categories; apart from the ABC transporter mentioned
above, only five of the 52 KOs had a mean proportion in both BMI sets of 0.2% or higher (Figure 1). K06147, in addition to being the most abundant protein in all patients, was 46% more abundant in low-BMI samples. The other four KOs that were found to have significant differences in abundance all belong to the peptides/nickel transport system module (KEGG module M00239). This module contains five ABC transporter proteins (K02031-K02035), four of which were found to be significantly more abundant in low-BMI patients (K02031-K02034; ratios ranging between 42 and 44%; corrected p-values < 0.01) (Figure 1). This transport system contains two ATP-binding proteins (K02031 and K02032), two permeases (K02033 and K02034) and one substrate-binding protein (K02035). Variation in the abundance of each KO between patients in the same BMI group (lean or obese) was found to be low, with mean proportions at most 0.2%. Although the difference in abundance of K02035 was not as statistically supported as that of the other subunits (p-value 0.021), it was found at similar levels of abundance between patients as the other four members of the transport system. Thus K02035 was included alongside the other subunits in the module to determine whether specific species are associated with the complex as a whole.
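The screening step described in this section (per-sample relative abundances for each KO, followed by a Bonferroni correction over all 4,849 KOs and a cut-off of 0.01) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the statistical test that produces the raw p-values is not stated here, so the sketch simply takes raw p-values as input. The function names, the placeholder identifiers `KO_A` and `KO_B`, and all numeric inputs are hypothetical and chosen only to show the arithmetic.

```python
# Illustrative sketch only; names and input values are hypothetical.

def relative_abundance(counts):
    """Convert raw KO sequence counts for one sample into proportions."""
    total = sum(counts.values())
    return {ko: n / total for ko, n in counts.items()}


def bonferroni(p_values, n_tests):
    """Bonferroni correction: multiply each raw p-value by the number
    of tests performed, capping the result at 1.0."""
    return {ko: min(1.0, p * n_tests) for ko, p in p_values.items()}


def significant_kos(p_values, n_tests, alpha=0.01):
    """Return the KOs whose corrected p-value falls below alpha."""
    corrected = bonferroni(p_values, n_tests)
    return sorted(ko for ko, p in corrected.items() if p < alpha)


if __name__ == "__main__":
    # Made-up counts for a single sample.
    print(relative_abundance({"KO_A": 3, "KO_B": 1}))

    # Made-up raw p-values; 4,849 is the number of KOs compared in the text.
    raw = {"KO_A": 1e-7, "KO_B": 1e-5}
    print(significant_kos(raw, n_tests=4849))
```

Because the correction multiplies each raw p-value by the number of comparisons, a KO needs a raw p-value below roughly 0.01 / 4,849 (about 2 × 10⁻⁶) to pass the threshold, which is why a conservative screen of this kind retains only a small fraction of the KOs tested.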