Allergic diseases are among the significant health problems of the world today, affecting people in both developed and developing countries. They are triggered by a variety of allergens and include asthma, rhinitis, food allergies, and atopic dermatitis. The incidence of allergic disease is increasing worldwide, reaching epidemic proportions in association with environmental exposure to allergens and with modern lifestyles. Thymic stromal lymphopoietin (TSLP) is a master switch at the interface between environmental allergens and pulmonary allergic immune responses. It plays a central role in polarizing dendritic cells (DCs) by enhancing OX40L expression, which induces the differentiation of naive T cells into Th2 cells.1-3
Thymic stromal lymphopoietin (TSLP) is an epithelial-derived cytokine related to IL-7 and is significantly elevated in patients with asthma and other allergic diseases. TSLP is expressed by epithelial cells of the skin, gut, and lung, and primes resident dendritic cells to promote Th2 cytokine production by the effector T cells they subsequently engage. Allergens stimulate the production of TSLP in inflamed tissue; its receptors, expressed primarily by dendritic cells, are also expressed on mast cells, where they likewise promote allergic responses. TSLP is also produced at barrier surfaces and by fibroblasts, mast cells, and keratinocytes.4,5 TSLP expression induced when proteases interact with protease-activated receptor 2 (PAR2) contributes to allergic inflammation.5 This indicates that TSLP might function in the early activation of the innate host defense response at the initial site of exposure in the epithelium, leading to Th2 polarization.5
Structurally, TSLP resembles IL-7 and is a four-helix bundle cytokine. Despite poor amino acid sequence identity, murine and human TSLP are similar at the functional level.6,7 Murine TSLP, which shares only 40% identity with human TSLP, was therefore used as a template in this study to build the three-dimensional (3D) structure of human TSLP, with the aim of developing drugs that inhibit its function, given its importance in the initiation of allergic inflammation. Allergic diseases have become complicated, involving several cell types and an increasing rate of recurrence, so it is necessary to develop new drugs that can potentially treat them. Drugs discovered by computational virtual screening of a large natural product database against this potential therapeutic target (thymic stromal lymphopoietin) could help treat these diseases in less time, at lower cost, and with fewer side effects. This may also reduce the rate of recurrence, owing to the promiscuous activity of natural compounds. This study focuses on determining the 3D structure of TSLP and inhibiting its function with natural small molecules, using molecular docking methods, for the development of anti-allergic drugs.
MATERIALS AND METHODS
Sequence Retrieval, Template Selection and Secondary Structure Prediction
The amino acid sequence of TSLP (159 residues; NP_149024.1) was retrieved from the National Center for Biotechnology Information (NCBI) and saved in FASTA format for comparative modeling. The sequence was then subjected to a protein-protein Basic Local Alignment Search Tool (BLASTp) search against the Protein Data Bank to find a suitable template. The crystal structure of chain A of the cytokine receptor from Mus musculus (mouse), PDB ID 4NN5, with a sequence identity of 40%, an E-value of 3e-23, a query coverage of 77%, and a resolution of 1.9 Å, was selected as the template. The secondary structure of the target sequence (TSLP) was analyzed with the Self-Optimized Prediction Method with Alignment (SOPMA).8 Further analysis of the template (4NN5) structure was performed in UCSF Chimera.9
Homology Modeling and Evaluation of Models
At the time of this study, the 3D structure of human thymic stromal lymphopoietin (TSLP) had not been determined by either X-ray crystallography or nuclear magnetic resonance methods. Because understanding how small molecules bind to TSLP for drug design requires its 3D structure, we applied homology modeling to determine it. The alignment of the TSLP and template sequences was generated using the align2d command, which takes structural information from the template into account when constructing the alignment, in MODELLER 9v16.10 Once a target-template alignment is constructed, MODELLER calculates a 3D model of the target automatically using its automodel class. Five similar models of TSLP were generated from the template structure (4NN5) and the alignment file. Of these five 3D structures, the one with the lowest Discrete Optimized Protein Energy (DOPE) score was chosen and evaluated using the PROCHECK 11 and ERRAT 12 online servers. Loops observed in the superimposition (TSLP model vs. 4NN5), together with loops in regions that have no corresponding structure in the template, were refined by running the loop modeling script in MODELLER. The loop-optimized model was further evaluated using PROCHECK and ERRAT.
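The model selection step described above amounts to taking the minimum over the DOPE scores of the candidate models. The following is a minimal Python sketch of that step only, not the MODELLER workflow itself. The model names and four of the five scores are hypothetical placeholders; only -13572.77344 is a value reported in this study.

```python
from typing import Dict


def pick_lowest_dope(scores: Dict[str, float]) -> str:
    """Return the name of the model with the lowest (most negative) DOPE score."""
    if not scores:
        raise ValueError("no models to choose from")
    return min(scores, key=lambda name: scores[name])


# Hypothetical model names and scores, for illustration only.
# Only -13572.77344 is a score reported in this study.
dope_scores = {
    "model_1": -13210.5,
    "model_2": -13572.77344,
    "model_3": -13105.2,
    "model_4": -13398.9,
    "model_5": -12987.4,
}

best = pick_lowest_dope(dope_scores)
print(best, dope_scores[best])
```

A lower DOPE score indicates a more favorable model within the same target, so the comparison is only meaningful among models built for the same sequence.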
Molecular Docking-Based Virtual Screening
Virtual screening based on molecular docking is one of the most widely used methods of structure-based drug design. This approach provides molecular information about protein-ligand interactions and is thus important for lead discovery and optimization. To identify novel natural hit compounds, we screened a large, freely accessible database, the ZINC Natural Product Database. Docking was performed in AutoDock 4.2.13 Polar hydrogens were added and the Lamarckian genetic algorithm was applied. A grid of 60 × 60 × 60 points in the X, Y, and Z directions, with center coordinates X = 4.348, Y = 7.451, and Z = -5.352 as defined by the AutoDock program, was built with a grid spacing of 0.375 Å. All other parameters were kept at their default values. The compounds with the lowest binding energy were retained as hits and further subjected to pharmacokinetic and toxicity evaluations.
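The size of the search space implied by these grid settings can be worked out directly. The sketch below assumes that the three coordinate values quoted with the grid are the grid center, and that the box edge along each axis is the number of points multiplied by the spacing, as in the AutoGrid convention. The function name `grid_box` is illustrative and is not part of AutoDock.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def grid_box(center: Sequence[float], npts: Sequence[int],
             spacing: float) -> Tuple[Vec3, Vec3, Vec3]:
    """Return (edge lengths, minimum corner, maximum corner) of the grid box in Å.

    Assumes the box is centered on `center` and spans npts * spacing per axis.
    """
    lengths = tuple(n * spacing for n in npts)
    lower = tuple(c - length / 2 for c, length in zip(center, lengths))
    upper = tuple(c + length / 2 for c, length in zip(center, lengths))
    return lengths, lower, upper


# Values taken from the docking setup described in the text.
lengths, lower, upper = grid_box((4.348, 7.451, -5.352), (60, 60, 60), 0.375)
print("edge lengths (Å):", lengths)
print("min corner (Å):", tuple(round(v, 3) for v in lower))
print("max corner (Å):", tuple(round(v, 3) for v in upper))
```

Under these assumptions the box is a cube with 22.5 Å edges, which is useful for checking that the intended binding region lies inside the search space.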
Pharmacokinetics and Toxicity Analysis of the Hit Compounds
The SMILES strings of the compounds with the lowest binding energy were submitted to the SwissADME 14 online server to study their pharmacokinetic and drug-likeness properties. Molecules were filtered on two molecular properties: an octanol-water partition coefficient (WLogP) below 5, which predicts low toxicity, low non-specific binding, and suitability for oral administration,15 and a topological polar surface area (TPSA) below 140 Å², which indicates a high probability of complete absorption.16 Ligands that met these criteria and formed good interactions with the protein were selected as hits. The hit ligands were further analyzed for bioavailability using the BOILED-Egg analysis 17 and for toxicity using OSIRIS DataWarrior.18 Two-dimensional protein-ligand interaction diagrams, bond distances, and acceptor angles were analyzed in the Schrödinger suite.19 The atoms involved in the interactions between the protein and the hit ligands were identified using PyMOL 20 software.
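The two-property filter described above can be expressed as a simple predicate over each compound's descriptors. The following is a minimal Python sketch. The compound names and descriptor values are hypothetical and are not results from this study; only the two thresholds (WLogP < 5 and TPSA < 140 Å²) come from the text.

```python
from dataclasses import dataclass
from typing import List

WLOGP_MAX = 5.0   # upper limit for WLogP, from the text
TPSA_MAX = 140.0  # upper limit for TPSA in Å², from the text


@dataclass
class Compound:
    name: str
    wlogp: float
    tpsa: float


def passes_filters(c: Compound) -> bool:
    """True if the compound satisfies both the WLogP and the TPSA criterion."""
    return c.wlogp < WLOGP_MAX and c.tpsa < TPSA_MAX


def select_hits(compounds: List[Compound]) -> List[Compound]:
    """Keep only the compounds that pass both filters, preserving order."""
    return [c for c in compounds if passes_filters(c)]


# Hypothetical compounds and descriptor values, for illustration only.
candidates = [
    Compound("cmpd_A", wlogp=3.2, tpsa=95.4),    # passes both
    Compound("cmpd_B", wlogp=5.6, tpsa=80.1),    # fails WLogP
    Compound("cmpd_C", wlogp=2.1, tpsa=151.0),   # fails TPSA
    Compound("cmpd_D", wlogp=4.9, tpsa=139.9),   # passes both
]

print([c.name for c in select_hits(candidates)])
```

In practice the descriptor values would be read from the table exported by the prediction server rather than typed in by hand.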
Secondary Structure Prediction
The prediction showed that the secondary structure of TSLP is dominated by alpha helix, 44.65% (71 residues), followed by random coil, 30.19% (48 residues), extended strand, 20.13% (32 residues), and beta turn, 9.23% (12 residues). Further analysis of the template structure (Figure 3A) in UCSF Chimera indicated non-corresponding structures at some residues of chain A of the 4NN5 crystal structure (cytokine receptor from Mus musculus).
Homology Modeling and Evaluations
Basic modeling of TSLP produced five models, and the one with the lowest DOPE score (-13572.77344; molpdf 1952.70386) was chosen. The model was evaluated using a Ramachandran plot (Figure 1), which placed 86.4% of residues in the most favored regions [A,B,L], and ERRAT, which gave an overall quality of 49.219% (Figure 2). These results highlighted important loops at residues 108 to 126, 38 to 45, and 57 to 68. The loops were optimized by running the loop modeling script in MODELLER 9.16, and the resulting optimized model (molpdf 2971.45166) was also evaluated; significant changes in the Ramachandran plot statistics and the ERRAT outcome were observed (Table 1). Structural superimposition of the Cα traces of the predicted model (TSLP) and the template (4NN5) gave an RMSD of 3.977 Å (Figure 3C). Further, match-alignment of the predicted TSLP 3D structure and chain A of template 4NN5 in UCSF Chimera showed the missing residues and the loops (Figure 4).
|PROCHECK / ERRAT|Basic Model|Loop Optimized Model|
|---|---|---|
|Most favoured regions [A,B,L]|86.4%|91.2%|
|Additional regions allowed [a,b,l,p]|8.8%|6.1%|
|Generously allowed regions [~a,~b,~l,~p]|0.7%|1.4%|
|Disallowed regions [XX]|2.0%|1.4%|
|ERRAT (overall quality)|49.219%|86.4%|
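The RMSD quoted for the model-template superimposition is the root of the mean squared distance between paired Cα atoms. The sketch below shows only that final calculation, and it assumes the two coordinate sets are already paired residue by residue and already superimposed. It does not perform the fitting step that a structure viewer carries out first. The coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def ca_rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root-mean-square deviation (Å) between two paired, pre-aligned Cα sets."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must have the same number of atoms")
    if not coords_a:
        raise ValueError("coordinate sets must not be empty")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical Cα coordinates: the second set is the first shifted by 2 Å in z.
model_ca = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
template_ca = [(0.0, 0.0, 2.0), (1.5, 0.0, 2.0), (3.0, 0.0, 2.0)]

print(ca_rmsd(model_ca, template_ca))
```

Because every atom in this toy example is displaced by the same 2 Å, the RMSD equals that displacement; for real structures, large deviations in a few loop residues can dominate the value.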
Molecular Docking and Pharmacokinetic Properties of the Hit Ligands
Several molecules bound to the predicted 3D structure of TSLP. The 58 molecules with the lowest binding energies were selected and evaluated for pharmacokinetic and physicochemical properties using the SwissADME server. Six molecules (Table 2) passed the filtering criteria of WLogP < 5 and TPSA < 140 Å² and were selected as top hits; their bonding interactions with the target protein were studied further (Table 3). The BOILED-Egg model has been proposed as an accurate predictive model that works by computing the lipophilicity and polarity of small molecules.21 The BOILED-Egg analysis of the six molecules (Figure 5) predicted all of them to be highly absorbed in the gastrointestinal tract.
Protein Ligand Interactions
The protein-ligand interactions are specific because different side chains form different bonds. ZINC19370008, ZINC19309311, ZINC00718292, and ZINC15959260 form hydrogen bonds with the amide residues (Asn85 and Gln80) of TSLP; these residues are polar because their oxygen and nitrogen atoms draw electrons toward themselves, which makes them good hydrogen-bonding partners. The aliphatic (Leu106) and aromatic (Tyr29 and Trp109) amino acids are essentially non-polar and therefore interact mainly through hydrophobic interactions and van der Waals forces; accordingly, ZINC20760321 and ZINC98368471 interact with TSLP residues hydrophobically.
The non-structural residues of the template 4NN5 correspond to the TSLP loop at residues 108 to 126, which was further modeled to give a TSLP 3D structure suitable for docking studies. Two molecules (ZINC19370008 and ZINC19309311) are predicted to cross the blood-brain barrier, and ZINC98368471 and ZINC00718292 are non-substrates of P-glycoprotein (P-gp). In drug design it is important to screen for compounds that are not P-gp substrates at an early stage to avoid drug-drug interactions. Physicochemical properties such as solubility and lipophilicity play a significant role in whether a compound can progress to become a successful drug candidate,21 so determining these parameters early in drug discovery is necessary. Understanding protein-ligand interactions is very important because many drugs bind selectively to target proteins in the body. All six molecules bind to the predicted model of TSLP with good binding energy (Table 3) and form protein-ligand interactions (Figure 6). ZINC98368471 forms a pi-cation interaction with the protein, a strong non-covalent binding force that is used throughout nature. Tyrosine (Tyr) and tryptophan (Trp) are generally hydrophobic and can contribute hydrogen bonds, and pi-cation interactions between amino acids contribute significantly to stabilizing protein secondary structure. In addition, drug-receptor interactions across a wide array of systems use pi-cation interactions.22 Therefore, of the six molecules that bind to the predicted TSLP, ZINC98368471, with a binding energy of -9.55, emerged as the lead because it has good interactions with the protein and good pharmacokinetic properties: high gastrointestinal absorption, CYP2D6 inhibition, and non-substrate status for P-glycoprotein. In the toxicity analysis, ZINC19309311 showed a low irritant character but is predicted to be non-mutagenic and non-tumorigenic, with no reproductive effect (Table 4).
This work predicts the 3D structure of thymic stromal lymphopoietin, an important therapeutic target for allergy, using comparative modeling methods. Six ligands with good binding energy were identified by structure-based drug design and molecular docking approaches and were then filtered for good gastrointestinal absorption and low predicted toxicity. Taken together, this study indicates that these six ligands can be tested further under in vitro and in vivo conditions as candidate drugs against allergic diseases.