SQL Programming

SQL introduction

SQL (Structured Query Language) is a database computer language created for managing data in Relational Database Management Systems (RDBMS). It was originally developed at IBM by Donald D. Chamberlin and Raymond F. Boyce in the early 1970s, initially called SEQUEL, in order to manipulate and retrieve data stored in IBM's original relational database management system.

The SQL language is used to operate databases including database creation, data query, update, schema creation and modification as well as data access control. It is an ANSI (American National Standards Institute) standard language with different versions for Relational Database System. The RDBMS (such as MySQL, MS SQL Server, IBM DB2, Sybase, PostgreSQL, SQLite, Oracle, MS Access) use SQL as their standard database language.

Basic SQL statements


					SELECT c1,c2,...;  --SELECT some fields FROM table1 WHERE (a clause to filter records)...
					FROM table1;
					WHERE condition;
					GROUP by column-name;
					ORDER by column-name;

					SELECT * FROM table1;  --SELECT all fields FROM table1...

					SELECT column_name(s) FROM table1
					INNER JOIN table2 ON table1.column_name = table2.column_name;  --select records with matching values in both tables.
					LEFT JOIN table2 ON table1.column_name = table2.column_name;
					--generate all records from the left table (table1) & the matched records from the right table (table2).
					RIGHT JOIN table2 ON table1.column_name = table2.column_name;
					--generate all records from the right table (table2) & the matched records from the left table (table1).
					FULL OUTER JOIN table2 ON table1.column_name = table2.column_name;
					--generate all records if there is a match in either left (table1) or right (table2) table records.

					SELECT column_name(s) FROM table1 T1, table1 T2  --The table is joined with itself.
					WHERE condition;
					

SQL vs NoSQL

The database technology has two main types of databases including SQL (structured query language) and NoSQL (“non SQL" or “Not only SQL"), which refer to relational databases and non-relational databases, respectively. Unlike SQL, NoSQL databases are document-oriented instead of tables. Thus, the unstructured data (such as articles, photos, videos, and web content etc.) are stored in a single document that is different from SQL databases (SQL & NoSQL Databases, 2019).

NoSQL databases have an advantage of ease of access by the app developers while relational databases are not easily accessible to the applications developed using the object-oriented programming languages such as Python and PHP (Hypertext Preprocessor).



SQL Applications

Creating and managing databases


					CREATE TABLE geneList(
					    id INT NOT NULL AUTO_INCREMENT,
					    geneName VARCHAR(255),
					    species VARCHAR(255),
					    trait VARCHAR(255),
					    geneSeq VARCHAR(255),
					    description VARCHAR(255),
					    fPrimer VARCHAR(255),
					    rPrimer VARCHAR(255),
					    ampliconSize VARCHAR(255),
					    PRIMARY KEY(id)
					);
					--Insert data
					INSERT INTO genelist(geneName, species, trait, geneSeq, description, fPrimer, rPrimer, ampliconSize) VALUES
					('G1.01gxxxxxx','species1','YIELD','TGGTAACTCG....ACTGATCGGC','DESCRIPTION1','ATAGATTATTTAGAGCCAGC','GCTGGCGTCCACGCATTTAC','118'),
					--more data
					('G8.06gxxxxxx','species1','YIELD','CGGACCAGCG....CCAGTGATCC','DESCRIPTION8','GCTTAGATAGCGAGCGACCG','AGGTGTGCGACCGCAGTGCA','113');
					

The top rows of the gene table (genelist) were exported from the database created using SQL scripts mentioned above:
genelist

This table can be easily accessed/queried using SQL statements such as "SELECT * FROM genelist, WHERE condition(id=...)", etc.:


					SELECT
					    *
					FROM
					    genelist
					WHERE
					    geneName like '%8g%';   -- select all the genes on linkage group 8
					    trait like '%disease%'  -- select all the disease resistance genes
					

Or, it can be updated using the following SQL statements such as:

  • UPDATE genelist, SET column1=..., column2=..., WHERE id=...;
  • DELETE FROM genelist, WHERE id=...;
  • INSERT INTO genelist (column1, column2, ...), VALUES (VALUE1, VALUE2, VALUE3, ...);
  • ALTER TABLE genelist, ADD column_name datatype;