Main Content

select

Execute SQL SELECT statement and import data into MATLAB

Description

example

data = select(conn,selectquery) returns imported data from the database connection conn for the specified SQL SELECT statement selectquery.

example

data = select(conn,selectquery,Name,Value) specifies additional options using one or more name-value pair arguments. For example, 'MaxRows',10 sets the maximum number of rows to return to 10 rows.

example

[data,metadata] = select(___) returns information about the imported data using any of the input argument combinations in the previous syntaxes. Use this information to change missing values in the imported data and view data types for each variable.

Examples

collapse all

Import data from a database in one step using the select function. You can access data and perform immediate data analysis.

The code assumes that you have a database table Patients stored on a Microsoft® SQL Server® database. This table contains patient data in 10 columns and rows. The table definition is:

CREATE TABLE Patients(
      LastName VARCHAR(50),
      Gender VARCHAR(10),
      Age TINYINT,
      Location VARCHAR(300),
      Height SMALLINT,
      Weight SMALLINT,
      Smoker BIT,
      Systolic FLOAT,
      Diastolic NUMERIC,
      SelfAssessedHealthStatus VARCHAR(20))

This example uses a Microsoft® SQL Server® Version 11.00.2100 database and the Microsoft® SQL Server® Driver 11.00.5058.

Create a database connection to a Microsoft® SQL Server® database with Windows® authentication. Specify a blank user name and password.

datasource = 'MS SQL Server Auth';
conn = database(datasource,'','');

Import all data from the Patients table by executing the SQL SELECT statement using the select function. data is a table that contains the imported data.

selectquery = 'SELECT * FROM Patients';

data = select(conn,selectquery)
data =

  10×10 table

     LastName      Gender     Age             Location             Height    Weight    Smoker    Systolic    Diastolic    SelfAssessedHealthStatus
    __________    ________    ___    __________________________    ______    ______    ______    ________    _________    ________________________

    'Smith'       'Male'      38     'Country General Hospital'    -32768       176    true      124          93          'Excellent'             
    'Johnson'     'Male'      43     'VA Hospital'                     69       163    false     NaN          77          'Fair'                  
    'Williams'    'Female'    38     ''                                64       131    false     125          83          'Good'                  
    'Jones'       'Female'     0     'VA Hospital'                     67       133    false     117          75          'Fair'                  
    'Broen'       'Female'    49     'Country General Hospital'        64       119    false     122          80          'Good'                  
    'Davis'       'Female'    46     'St Mary's Medical Center'        68       142    false     121         NaN          'Good'                  
    'Miller'      'Female'    33     'VA Hospital'                     64       142    true      130          88          'Good'                  
    'Wilson'      'Male'      40     'VA Hospital'                 -32768       180    false     115          82          'Good'                  
    'Moore'       'Male'      28     'St Mary's Medical Center'        68    -32768    false     115          78          'Excellent'             
    'Taylor'      'Female'    31     'Country General Hospital'        68       132    false     NaN          86          'Excellent'             

Determine the number of male patients by immediately accessing the data. Use the count function to find occurrences in the gender data of the character vector that represents a male. Determine the total number of occurrences.

males = count(data.Gender,'Male');
sum(males)
ans =

     4

Close the database connection.

close(conn)

Import a limited number of rows from a database in one step using the select function. Database Toolbox™ imports the data using MATLAB® numeric data types that correspond to data types in the database table. After importing data, you can access data and perform immediate data analysis.

The code assumes that you have a database table Patients stored on a Microsoft® SQL Server® database. This table contains patient data in 10 columns and rows. The table definition is:

CREATE TABLE Patients(
      LastName VARCHAR(50),
      Gender VARCHAR(10),
      Age TINYINT,
      Location VARCHAR(300),
      Height SMALLINT,
      Weight SMALLINT,
      Smoker BIT,
      Systolic FLOAT,
      Diastolic NUMERIC,
      SelfAssessedHealthStatus VARCHAR(20))

Here, connect to a Microsoft® SQL Server® Version 11.00.2100 database using the Microsoft® SQL Server® Driver 11.00.5058.

Create a database connection to a Microsoft® SQL Server® database with Windows® authentication. Specify a blank user name and password.

datasource = 'MS SQL Server Auth';
conn = database(datasource,'','');

Import data from the Patients table by executing the SQL SELECT statement using the select function. Limit the number of imported rows using the name-value pair argument 'MaxRows'.

data is a table. The MATLAB® data types in the table correspond to the data types in the database. Here, Age has data type uint8 that corresponds to TINYINT in the table definition.

metadata is a table that contains additional information about each variable in data.

  • VariableType -- MATLAB® data type

  • MissingValue -- NULL value representation

  • MissingRows -- Vector of row indices that contain a missing value

selectquery = 'SELECT * FROM Patients';

[data,metadata] = select(conn,selectquery,'MaxRows',5)
data =

  5×10 table

     LastName      Gender     Age             Location             Height    Weight    Smoker    Systolic    Diastolic    SelfAssessedHealthStatus
    __________    ________    ___    __________________________    ______    ______    ______    ________    _________    ________________________

    'Smith'       'Male'      38     'Country General Hospital'    -32768    176       true      124         93           'Excellent'             
    'Johnson'     'Male'      43     'VA Hospital'                     69    163       false     NaN         77           'Fair'                  
    'Williams'    'Female'    38     ''                                64    131       false     125         83           'Good'                  
    'Jones'       'Female'     0     'VA Hospital'                     67    133       false     117         75           'Fair'                  
    'Broen'       'Female'    49     'Country General Hospital'        64    119       false     122         80           'Good'                  


metadata =

  10×3 table

                                VariableType    MissingValue    MissingRows 
                                ____________    ____________    ____________

    LastName                    'char'          ''              [0×1 double]
    Gender                      'char'          ''              [0×1 double]
    Age                         'uint8'         [     0]        [         4]
    Location                    'char'          ''              [0×1 double]
    Height                      'int16'         [-32768]        [         1]
    Weight                      'int16'         [-32768]        [0×1 double]
    Smoker                      'logical'       [     0]        [0×1 double]
    Systolic                    'single'        [   NaN]        [         2]
    Diastolic                   'double'        [   NaN]        [0×1 double]
    SelfAssessedHealthStatus    'char'          ''              [0×1 double]

Determine the number of male patients by immediately accessing the data. Use the count function to find occurrences in the gender data of the character vector that represents a male. Determine the total number of occurrences.

males = count(data.Gender,'Male');
sum(males)
ans =

     2

Close the database connection.

close(conn)

Import data from a database in one step using the select function. Database Toolbox™ imports the data using MATLAB® numeric data types that correspond to data types in the database table. You can view data type information in the imported data. You can also access data and perform immediate data analysis.

The code assumes that you have a database table Patients stored on a Microsoft® SQL Server® database. This table contains patient data in 10 columns and rows. The table definition is:

CREATE TABLE Patients(
      LastName VARCHAR(50),
      Gender VARCHAR(10),
      Age TINYINT,
      Location VARCHAR(300),
      Height SMALLINT,
      Weight SMALLINT,
      Smoker BIT,
      Systolic FLOAT,
      Diastolic NUMERIC,
      SelfAssessedHealthStatus VARCHAR(20))

Here, connect to a Microsoft® SQL Server® Version 11.00.2100 database using the Microsoft® SQL Server® Driver 11.00.5058.

Create a database connection to a Microsoft® SQL Server® database with Windows® authentication. Specify a blank user name and password.

datasource = 'MS SQL Server Auth';
conn = database(datasource,'','');

Import all data from the Patients table by executing the SQL SELECT statement using the select function.

data is a table. The MATLAB® data types in the table correspond to the data types in the database. Here, Age has the MATLAB® data type uint8 that corresponds to TINYINT in the table definition.

metadata is a table that contains additional information about each variable in data.

  • VariableType -- MATLAB® data type

  • MissingValue -- Null value representation

  • MissingRows -- Vector of row indices that contain a missing value

selectquery = 'SELECT * FROM Patients';

[data,metadata] = select(conn,selectquery)
data =

  10×10 table

     LastName      Gender     Age             Location             Height    Weight    Smoker    Systolic    Diastolic    SelfAssessedHealthStatus
    __________    ________    ___    __________________________    ______    ______    ______    ________    _________    ________________________

    'Smith'       'Male'      38     'Country General Hospital'    -32768       176    true      124          93          'Excellent'             
    'Johnson'     'Male'      43     'VA Hospital'                     69       163    false     NaN          77          'Fair'                  
    'Williams'    'Female'    38     ''                                64       131    false     125          83          'Good'                  
    'Jones'       'Female'     0     'VA Hospital'                     67       133    false     117          75          'Fair'                  
    'Broen'       'Female'    49     'Country General Hospital'        64       119    false     122          80          'Good'                  
    'Davis'       'Female'    46     'St Mary's Medical Center'        68       142    false     121         NaN          'Good'                  
    'Miller'      'Female'    33     'VA Hospital'                     64       142    true      130          88          'Good'                  
    'Wilson'      'Male'      40     'VA Hospital'                 -32768       180    false     115          82          'Good'                  
    'Moore'       'Male'      28     'St Mary's Medical Center'        68    -32768    false     115          78          'Excellent'             
    'Taylor'      'Female'    31     'Country General Hospital'        68       132    false     NaN          86          'Excellent'             


metadata =

  10×3 table

                                VariableType    MissingValue    MissingRows 
                                ____________    ____________    ____________

    LastName                    'char'          ''              [0×1 double]
    Gender                      'char'          ''              [0×1 double]
    Age                         'uint8'         [     0]        [         4]
    Location                    'char'          ''              [0×1 double]
    Height                      'int16'         [-32768]        [2×1 double]
    Weight                      'int16'         [-32768]        [         9]
    Smoker                      'logical'       [     0]        [0×1 double]
    Systolic                    'single'        [   NaN]        [2×1 double]
    Diastolic                   'double'        [   NaN]        [         6]
    SelfAssessedHealthStatus    'char'          ''              [0×1 double]

View data types of each variable in the table.

metadata.VariableType
ans =

  10×1 cell array

    'char'
    'char'
    'uint8'
    'char'
    'int16'
    'int16'
    'logical'
    'single'
    'double'
    'char'

Determine the number of male patients by immediately accessing the data. Use the count function to find occurrences in the gender data of the character vector that represents a male. Determine the total number of occurrences.

males = count(data.Gender,'Male');
sum(males)
ans =

     4

Close the database connection.

close(conn)

Import data from a database in one step using the select function. During import, the select function sets default values for missing data in each row. Use the information about the imported data to change the default values.

The code assumes that you have a database table Patients stored on a Microsoft® SQL Server® database. This table contains patient data in 10 columns and rows. The table definition is:

CREATE TABLE Patients(
      LastName VARCHAR(50),
      Gender VARCHAR(10),
      Age TINYINT,
      Location VARCHAR(300),
      Height SMALLINT,
      Weight SMALLINT,
      Smoker BIT,
      Systolic FLOAT,
      Diastolic NUMERIC,
      SelfAssessedHealthStatus VARCHAR(20))

Here, connect to a Microsoft® SQL Server® Version 11.00.2100 database using the Microsoft® SQL Server® Driver 11.00.5058.

Create a database connection to a Microsoft® SQL Server® database with Windows® authentication. Specify a blank user name and password.

datasource = 'MS SQL Server Auth';
conn = database(datasource,'','');

Import all data from the Patients table by executing the SQL SELECT statement using the select function.

data is a table that contains the imported data.

metadata is a table that contains additional information about each variable in data.

  • VariableType -- MATLAB® data type

  • MissingValue -- NULL value representation

  • MissingRows -- Vector of row indices that indicate the location of missing values

selectquery = 'SELECT * FROM Patients';

[data,metadata] = select(conn,selectquery)
data =

  10×10 table array

     LastName      Gender     Age             Location             Height    Weight    Smoker    Systolic    Diastolic    SelfAssessedHealthStatus
    __________    ________    ___    __________________________    ______    ______    ______    ________    _________    ________________________

    'Smith'       'Male'      38     'Country General Hospital'    -32768       176    true      124          93          'Excellent'             
    'Johnson'     'Male'      43     'VA Hospital'                     69       163    false     NaN          77          'Fair'                  
    'Williams'    'Female'    38     ''                                64       131    false     125          83          'Good'                  
    'Jones'       'Female'     0     'VA Hospital'                     67       133    false     117          75          'Fair'                  
    'Broen'       'Female'    49     'Country General Hospital'        64       119    false     122          80          'Good'                  
    'Davis'       'Female'    46     'St Mary's Medical Center'        68       142    false     121         NaN          'Good'                  
    'Miller'      'Female'    33     'VA Hospital'                     64       142    true      130          88          'Good'                  
    'Wilson'      'Male'      40     'VA Hospital'                 -32768       180    false     115          82          'Good'                  
    'Moore'       'Male'      28     'St Mary's Medical Center'        68    -32768    false     115          78          'Excellent'             
    'Taylor'      'Female'    31     'Country General Hospital'        68       132    false     NaN          86          'Excellent'             


metadata =

  10×3 table array

                                VariableType    MissingValue    MissingRows 
                                ____________    ____________    ____________

    LastName                    'char'          ''              [0×1 double]
    Gender                      'char'          ''              [0×1 double]
    Age                         'uint8'         [     0]        [         4]
    Location                    'char'          ''              [0×1 double]
    Height                      'int16'         [-32768]        [2×1 double]
    Weight                      'int16'         [-32768]        [         9]
    Smoker                      'logical'       [     0]        [0×1 double]
    Systolic                    'single'        [   NaN]        [2×1 double]
    Diastolic                   'double'        [   NaN]        [         6]
    SelfAssessedHealthStatus    'char'          ''              [0×1 double]

Retrieve indices that indicate the location of missing values in the Height variable using the metadata output argument.

values = metadata.MissingRows{'Height'}
values =

     1
     8

Change the default value for missing data from -32768 to 0 using a for loop. Access the imported data using the indices.

for i = 1:length(values)
    data.Height(values(i)) = 0;
end

View the imported data.

data.Height
ans =

  10×1 int16 column vector

    0
   69
   64
   67
   64
   68
   64
    0
   68
   68

Missing values appear as 0.

Close the database connection.

close(conn)

Import data from a database in one step using the select function. During import, the select function sets default values for missing data in each row. Use the information about the imported data to change default values by indexing into the vector.

The code assumes that you have a database table Patients stored on a Microsoft® SQL Server® database. This table contains patient data in 10 columns and rows. The table definition is:

CREATE TABLE Patients(
      LastName VARCHAR(50),
      Gender VARCHAR(10),
      Age TINYINT,
      Location VARCHAR(300),
      Height SMALLINT,
      Weight SMALLINT,
      Smoker BIT,
      Systolic FLOAT,
      Diastolic NUMERIC,
      SelfAssessedHealthStatus VARCHAR(20))

Here, connect to a Microsoft® SQL Server® Version 11.00.2100 database using the Microsoft® SQL Server® Driver 11.00.5058.

Create a database connection to a Microsoft® SQL Server® database with Windows® authentication. Specify a blank user name and password.

datasource = 'MS SQL Server Auth';
conn = database(datasource,'','');

Import all data from the Patients table by executing the SQL SELECT statement using the select function.

data is a table that contains the imported data.

metadata is a table that contains additional information about each variable in data.

  • VariableType -- MATLAB® data type

  • MissingValue -- NULL value representation

  • MissingRows -- Vector of row indices that indicate the location of missing values

selectquery = 'SELECT * FROM Patients';

[data,metadata] = select(conn,selectquery)
data =

  10×10 table array

     LastName      Gender     Age             Location             Height    Weight    Smoker    Systolic    Diastolic    SelfAssessedHealthStatus
    __________    ________    ___    __________________________    ______    ______    ______    ________    _________    ________________________

    'Smith'       'Male'      38     'Country General Hospital'    -32768       176    true      124          93          'Excellent'             
    'Johnson'     'Male'      43     'VA Hospital'                     69       163    false     NaN          77          'Fair'                  
    'Williams'    'Female'    38     ''                                64       131    false     125          83          'Good'                  
    'Jones'       'Female'     0     'VA Hospital'                     67       133    false     117          75          'Fair'                  
    'Broen'       'Female'    49     'Country General Hospital'        64       119    false     122          80          'Good'                  
    'Davis'       'Female'    46     'St Mary's Medical Center'        68       142    false     121         NaN          'Good'                  
    'Miller'      'Female'    33     'VA Hospital'                     64       142    true      130          88          'Good'                  
    'Wilson'      'Male'      40     'VA Hospital'                 -32768       180    false     115          82          'Good'                  
    'Moore'       'Male'      28     'St Mary's Medical Center'        68    -32768    false     115          78          'Excellent'             
    'Taylor'      'Female'    31     'Country General Hospital'        68       132    false     NaN          86          'Excellent'             


metadata =

  10×3 table array

                                VariableType    MissingValue    MissingRows 
                                ____________    ____________    ____________

    LastName                    'char'          ''              [0×1 double]
    Gender                      'char'          ''              [0×1 double]
    Age                         'uint8'         [     0]        [         4]
    Location                    'char'          ''              [0×1 double]
    Height                      'int16'         [-32768]        [2×1 double]
    Weight                      'int16'         [-32768]        [         9]
    Smoker                      'logical'       [     0]        [0×1 double]
    Systolic                    'single'        [   NaN]        [2×1 double]
    Diastolic                   'double'        [   NaN]        [         6]
    SelfAssessedHealthStatus    'char'          ''              [0×1 double]

Retrieve indices that indicate the location of missing values in the Height variable using the metadata output argument.

values = metadata(5,3)
valuesindex = values.MissingRows{1}
values =

  table

              MissingRows 
              ____________

    Height    [2×1 double]


valuesindex =

     1
     8

Change the default value for missing data from -32768 to 0 using vector indexing.

data.Height(valuesindex) = 0;

View the imported data.

data.Height
ans =

  10×1 int16 column vector

    0
   69
   64
   67
   64
   68
   64
    0
   68
   68

Missing values appear as 0.

Close the database connection.

close(conn)

Input Arguments

collapse all

Database connection, specified as an ODBC connection object or JDBC connection object created using the database function.

SQL SELECT statement, specified as a character vector or string. The select function only executes SQL SELECT statements. To execute other SQL statements, use the exec function.

Example: 'SELECT * FROM inventoryTable'

Data Types: char | string

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'MaxRows',100,'QueryTimeOut',5 returns 100 rows of data and waits 5 seconds to execute the SQL SELECT statement.

Maximum number of rows to return, specified as the comma-separated pair consisting of 'MaxRows' and a positive numeric scalar. By default, the select function returns all rows from the executed SQL query. Use this name-value pair argument to limit the number of rows imported into MATLAB®.

Example: 'MaxRows',10

Data Types: double

SQL query timeout, specified as the comma-separated pair consisting of 'QueryTimeOut' and a positive numeric scalar. By default, the select function ignores the timeout value. Use this name-value pair argument to specify the number of seconds to wait for executing the SQL query selectquery.

Example: 'QueryTimeOut',15

Output Arguments

collapse all

Imported data, returned as a table. The rows of the table correspond to the rows of data returned from the executed SQL query selectquery. The variable names of the table specify the columns in the SQL query.

The select function returns date or time data as character vectors in the table. This function returns text as character vectors or a cell array of character vectors. Strings are not supported in the table.

If no data to import exists, then data is an empty table.

Information about imported data, returned as a table. The row names of metadata are variable names in data. This function stores each variable name in the metadata table as a cell array. metadata has these variable names:

  • VariableType — Data types of each variable in data

  • MissingValue — Representation of missing value for each variable in data

  • MissingRows — Vector of row indices that indicate locations of missing values for each variable in data

This table shows how MATLAB represents NULL values in the database by default after data import.

Database Data Type Default NULL Value

SIGNED TINYINT

-128

UNSIGNED TINYINT

0

SIGNED SMALLINT

-32768

UNSIGNED SMALLINT

0

SIGNED INT

-2147483648

UNSIGNED INT

0

SIGNED BIGINT

-9223372036854775808

UNSIGNED BIGINT

0

REAL

NaN

FLOAT

NaN

DOUBLE

NaN

DECIMAL

NaN

NUMERIC

NaN

Boolean

false

Date, time, or text

''

To change the NULL value representation in the imported data, replace the default value by looping through the imported data or using vector indexing.

Limitations

  • You cannot customize missing values in the output argument data using the select function. Index into the imported data using the metadata output argument instead.

  • The output argument data does not support cell and struct data types. The select function only supports table.

Alternative Functionality

Use the exec and fetch functions for full functionality when importing data. For differences between the select function and this alternative, see Data Import Using Database Explorer App or Command Line.

Version History

Introduced in R2017a