---
title: "SemanticImport"
language: "en"
type: "Symbol"
summary: "SemanticImport[file] attempts to import a file semantically to give a Dataset object. SemanticImport[file, type] attempts to interpret all elements in the file as being of the specified type. SemanticImport[file, {type1, type2, ...}] attempts to interpret elements in successive columns as being of the specified types. SemanticImport[file, <|col1 -> type1, col2 -> type2, ...|>] keeps only the columns coli specified by their positions or names. SemanticImport[file, typespec, form] puts the result in the specified form."
keywords: 
- WDF
- canonicalization
- scraping
- entity identification
- data normalization
- parsing
- spreadsheets
canonical_url: "https://reference.wolfram.com/language/ref/SemanticImport.html"
source: "Wolfram Language Documentation"
related_guides: 
  - 
    title: "Free-Form & External Input"
    link: "https://reference.wolfram.com/language/guide/FreeFormAndExternalInput.en.md"
  - 
    title: "Computation with Structured Datasets"
    link: "https://reference.wolfram.com/language/guide/ComputationWithStructuredDatasets.en.md"
  - 
    title: "WDF (Wolfram Data Framework)"
    link: "https://reference.wolfram.com/language/guide/WDFWolframDataFramework.en.md"
  - 
    title: "Knowledge Representation & Access"
    link: "https://reference.wolfram.com/language/guide/KnowledgeRepresentationAndAccess.en.md"
  - 
    title: "Scientific Data Analysis"
    link: "https://reference.wolfram.com/language/guide/ScientificDataAnalysis.en.md"
  - 
    title: "Setting Up Input Interpreters"
    link: "https://reference.wolfram.com/language/guide/InterpretingStrings.en.md"
  - 
    title: "Text Analysis"
    link: "https://reference.wolfram.com/language/guide/TextAnalysis.en.md"
related_workflows: 
  - 
    title: "Import a File"
    link: "https://reference.wolfram.com/language/workflow/ImportAFile.en.md"
  - 
    title: "Extract Columns in a Dataset"
    link: "https://reference.wolfram.com/language/workflow/ExtractColumnsInADataset.en.md"
related_functions: 
  - 
    title: "SemanticImportString"
    link: "https://reference.wolfram.com/language/ref/SemanticImportString.en.md"
  - 
    title: "SemanticInterpretation"
    link: "https://reference.wolfram.com/language/ref/SemanticInterpretation.en.md"
  - 
    title: "Interpreter"
    link: "https://reference.wolfram.com/language/ref/Interpreter.en.md"
  - 
    title: "WolframAlpha"
    link: "https://reference.wolfram.com/language/ref/WolframAlpha.en.md"
  - 
    title: "Import"
    link: "https://reference.wolfram.com/language/ref/Import.en.md"
  - 
    title: "Dataset"
    link: "https://reference.wolfram.com/language/ref/Dataset.en.md"
---
# SemanticImport
⚠ *Unsupported in Public Cloud*

SemanticImport[file] attempts to import a file semantically to give a Dataset object.

SemanticImport[file, type] attempts to interpret all elements in the file as being of the specified type.

SemanticImport[file, {type1, type2, …}] attempts to interpret elements in successive columns as being of the specified types. 

SemanticImport[file, <|col1 -> type1, col2 -> type2, …|>] keeps only the columns coli specified by their positions or names.

SemanticImport[file, typespec, form] puts the result in the specified form.

## Details and Options

* In ``SemanticImport[file]``, ``file`` can be specified as ``File["path"]`` or simply ``"path"``.

* ``SemanticImport`` is primarily intended for one- and two-dimensional arrays of elements.

* ``SemanticImport`` can use free-form linguistics to interpret elements in the structure it is given.

* Types of objects returned include numbers, ``Quantity`` objects, ``Entity`` objects, ``DateObject``, ``GeoPosition``, etc.

* ``SemanticImport`` makes detailed assumptions, for example about date formats, by looking at all elements in particular rows or columns of the input.

* Possible values for ``type`` include:

|                  |                                              |
| ---------------- | -------------------------------------------- |
| Automatic        | choose type automatically                    |
| "String"         | Unicode string                               |
| "Number"         | number in any standard format                |
| "Integer"        | integer in decimal notation                  |
| "Real"           | real in decimal notation                     |
| "Quantity"       | quantity with units                          |
| "Currency"       | currency amount                              |
| "Date"           | date in any standard format                  |
| "DateTime"       | date and time                                |
| "Time"           | time of day                                  |
| "GeoCoordinates" | geo position specified as latitude, longitude |
| "URL"            | correctly formatted URL                      |
| "EmailAddress"   | correctly formatted email address            |
| "Country"        | country given in natural language            |
| "City"           | city given in natural language               |
| None             | skip a column                                |
| ispec            | any basic form used by Interpreter           |

* The following options can be given to indicate features of the input:

|                    |           |                                                     |
| ------------------ | --------- | --------------------------------------------------- |
| CharacterEncoding  | Automatic | assumed encoding of input file                      |
| Delimiters         | Automatic | delimiters between elements                         |
| HeaderLines        | Automatic | number of lines to treat as headers                 |
| ExcludedLines      | {}        | lines to exclude from result                        |
| MissingDataRules   | {}        | rules for replacing data to be considered "missing" |

* Possible values for ``form`` include:

|                |                                                                          |
| -------------- | ------------------------------------------------------------------------ |
| "Dataset"      | a row-oriented dataset                                                   |
| "List"         | a single column as a list                                                |
| "Columns"      | a list of columns, each given as a list                                  |
| "NamedColumns" | an association associating column name with list of contents             |
| "Rows"         | a list of rows, each given as a list                                     |
| "NamedRows"    | a list of rows, each given as an association from column name to content |
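
The forms above hold the same interpreted elements in different layouts. The following is a minimal sketch, assuming the example file used elsewhere on this page imports as a rectangular table; the variable names are illustrative only:

```wl
file = "ExampleData/RetailSales.tsv";

(* the same data in three of the forms listed above *)
rows = SemanticImport[file, Automatic, "Rows"];
cols = SemanticImport[file, Automatic, "Columns"];
named = SemanticImport[file, Automatic, "NamedColumns"];

(* number of rows and number of columns *)
{Length[rows], Length[cols]}

(* column names taken from the header *)
Keys[named]

(* for a rectangular table, transposing the columns is expected to give the rows *)
Transpose[cols] === rows
```

Choosing ``"Columns"`` or ``"NamedColumns"`` is convenient when each column is processed separately; ``"Rows"`` or ``"NamedRows"`` suits record-by-record processing.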

* When elements cannot be interpreted, forms returned in their place include:

|                                   |                                                 |
| --------------------------------- | ----------------------------------------------- |
| Missing["Empty"]                  | an empty or whitespace element                  |
| Missing["Invalid", "string"]      | data with invalid or meaningless fields         |
| Missing["Unrecognized", "string"] | element that could not be parsed                |
| Missing["ByDesignation", value]   | an element matching MissingDataRules            |
| Missing[custom]                   | a Missing[…] provided through MissingDataRules  |
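
Because uninterpreted elements come back as ``Missing[…]`` expressions, they can be located with ordinary pattern matching. A minimal sketch, assuming the example file ``ExampleData/RetailSalesMissings.tsv`` used later on this page; ``cols`` and ``missingCounts`` are illustrative names:

```wl
cols = SemanticImport["ExampleData/RetailSalesMissings.tsv", Automatic, "NamedColumns"];

(* number of Missing[...] elements in each column *)
missingCounts = Map[Count[#, _Missing] &, cols]

(* tally the distinct Missing[...] expressions across all columns *)
Counts[Cases[Values[cols], _Missing, {2}]]
```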

---

## Examples (26)

### Basic Examples (7)

Import a file, automatically detecting and interpreting dates and cities:

```wl
In[1]:= sales = SemanticImport["ExampleData/RetailSales.tsv"]

Out[1]= Dataset[<>]
```

Columns shown in bold correspond to semantic objects in the Wolfram Language:

```wl
In[2]:= sales[1, "Date"]

Out[2]= DateObject[{2014, 1, 1}, "Day", "Gregorian", -5.]

In[3]:= sales[2, "City"]

Out[3]= Entity["City", {"NewYork", "NewYork", "UnitedStates"}]

In[4]:= %["Population"]

Out[4]= Quantity[8537673, "People"]
```

---

Import a file with the specified column types:

```wl
In[1]:= SemanticImport["ExampleData/RetailSales.tsv", {"Date", "City", "Integer"}]

Out[1]= Dataset[<>]
```

---

Import only some columns of a file, in the specified format, using column numbers:

```wl
In[1]:= SemanticImport["ExampleData/RetailSales.tsv", <|1 -> "Date", 3 -> Automatic|>]

Out[1]= Dataset[<>]
```

---

Import only some columns of a file, in the specified format, using column names:

```wl
In[1]:= SemanticImport["ExampleData/RetailSales.tsv", <|"Date" -> "Date", "Sales" -> Automatic|>]

Out[1]= Dataset[<>]
```

---

Import only some columns, specifying ``None`` for columns that should be dropped:

```wl
In[1]:= SemanticImport["ExampleData/RetailSales.tsv", {None, "City", "Integer"}]

Out[1]= Dataset[<>]
```

---

Import a file as a list of rows:

```wl
In[1]:= SemanticImport["ExampleData/RetailSales.tsv", Automatic, "Rows"][[ ;; 5]]

Out[1]= {{DateObject[{2014, 1, 1}, "Day", "Gregorian", -5.], Entity["City", {"Boston", "Massachusetts", "UnitedStates"}], 198}, {DateObject[{2014, 1, 1}, "Day", "Gregorian", -5.], Entity["City", {"NewYork", "NewYork", "UnitedStates"}], 220}, {DateObject[{2 ...  "France"}], 215}, {DateObject[{2014, 1, 1}, "Day", "Gregorian", -5.], Entity["City", {"London", "GreaterLondon", "UnitedKingdom"}], 225}, {DateObject[{2014, 1, 1}, "Day", "Gregorian", -5.], Entity["City", {"Shanghai", "Shanghai", "China"}], 241}}
```

---

Import a file as a list of columns:

```wl
In[1]:= {dates, cities, sales} = SemanticImport["ExampleData/RetailSales.tsv", Automatic, "Columns"];

In[2]:= dates[[ ;; 5]]

Out[2]= {DateObject[{2014, 1, 1}, "Day", "Gregorian", -5.], DateObject[{2014, 1, 1}, "Day", "Gregorian", -5.], DateObject[{2014, 1, 1}, "Day", "Gregorian", -5.], DateObject[{2014, 1, 1}, "Day", "Gregorian", -5.], DateObject[{2014, 1, 1}, "Day", "Gregorian", -5.]}

In[3]:= cities[[ ;; 5]]

Out[3]= {Entity["City", {"Boston", "Massachusetts", "UnitedStates"}], Entity["City", {"NewYork", "NewYork", "UnitedStates"}], Entity["City", {"Paris", "IleDeFrance", "France"}], Entity["City", {"London", "GreaterLondon", "UnitedKingdom"}], Entity["City", {"Shanghai", "Shanghai", "China"}]}

In[4]:= sales[[ ;; 5]]

Out[4]= {198, 220, 215, 225, 241}
```

### Scope (3)

Import a file using a given character encoding:

```wl
In[1]:= SemanticImport["ExampleData/UnicodeRetailSales.tsv", CharacterEncoding -> "Unicode"]

Out[1]= Dataset[<>]
```

---

Import a file using the given delimiter:

```wl
In[1]:= SemanticImport["ExampleData/dpkg.log", Delimiters -> " "]

Out[1]= Dataset[<>]
```

---

Specify that the first line of the file to import is a header:

```wl
In[1]:= SemanticImport["ExampleData/buildings.dat", HeaderLines -> 1]

Out[1]= Dataset[<>]
```

Specify that the first and fifth lines of a file should be skipped:

```wl
In[2]:= SemanticImport["ExampleData/buildings.dat", ExcludedLines -> {1, 5}]

Out[2]= Dataset[<>]
```

Return values of the form ``"Unknown"`` as the special form ``Missing["UnknownData"]``:

```wl
In[3]:= SemanticImport["ExampleData/RetailSalesMissings.tsv", MissingDataRules -> {"Unknown" -> Missing["UnknownData"]}]

Out[3]= Dataset[<>]
```
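
Once missing values are marked, incomplete rows can be dropped before further analysis. This is a sketch only, assuming the same example file and rule as above; ``rows`` and ``complete`` are illustrative names:

```wl
rows = SemanticImport["ExampleData/RetailSalesMissings.tsv", Automatic, "NamedRows",
   MissingDataRules -> {"Unknown" -> Missing["UnknownData"]}];

(* keep only the rows in which no value is a Missing[...] expression *)
complete = Select[rows, NoneTrue[Values[#], MissingQ] &];

(* row counts before and after filtering *)
{Length[rows], Length[complete]}
```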

### Options (7)

``SemanticImport`` uses many of the same options as ``SemanticImportString``. See ``SemanticImportString`` for more examples.

#### CharacterEncoding (1)

The wrong character encoding can derail a good interpretation. Create a file of Unicode-encoded data:

```wl
In[1]:=
path = FileNameJoin[{$TemporaryDirectory, "UnicodeBuildings.dat"}];
Export[path, Import["ExampleData/buildings.dat", "Text"], "Text", CharacterEncoding -> "Unicode"];
```

Import the data using the default character encoding:

```wl
In[2]:= Take[SemanticImport[path, Automatic, "Rows", HeaderLines -> 1], 5]

Out[2]= {{""}, {"1"}, {" "}, {"|"}, {" "}}
```

Import the data, specifying that it is encoded as Unicode:

```wl
In[3]:= SemanticImport[path, Automatic, "Rows", "CharacterEncoding" -> "Unicode", HeaderLines -> 1]//Take[#, 5]&

Out[3]= {{1, "Taipei 101", Entity["City", {"TaipeiCity", "TaipeiCity", "Taiwan"}], Entity["Country", "Taiwan"], 2004, 101, 508}, {2, "Petronas Tower 1", Entity["City", {"KualaLumpur", "KualaLumpur", "Malaysia"}], Entity["Country", "Malaysia"], 1998, 88, 45 ...  {4, "Sears Tower", Entity["City", {"Chicago", "Illinois", "UnitedStates"}], Entity["Country", "UnitedStates"], 1974, 110, 442}, {5, "Jin Mao Building", Entity["City", {"Shanghai", "Shanghai", "China"}], Entity["Country", "China"], 1999, 88, 421}}
```

#### Delimiters (1)

Specifying the delimiter determines how the values are separated:

```wl
In[1]:= SemanticImport["ExampleData/buildings.dat", Automatic, "Rows", Delimiters -> "|", HeaderLines -> 1]//Take[#, 5]&

Out[1]= {{1, "Taipei 101", Entity["City", {"TaipeiCity", "TaipeiCity", "Taiwan"}], Entity["Country", "Taiwan"], 2004, 101, 508}, {2, "Petronas Tower 1", Entity["City", {"KualaLumpur", "KualaLumpur", "Malaysia"}], Entity["Country", "Malaysia"], 1998, 88, 45 ...  {4, "Sears Tower", Entity["City", {"Chicago", "Illinois", "UnitedStates"}], Entity["Country", "UnitedStates"], 1974, 110, 442}, {5, "Jin Mao Building", Entity["City", {"Shanghai", "Shanghai", "China"}], Entity["Country", "China"], 1999, 88, 421}}
```

Specifying a nonexistent delimiter gives a single column of newline-separated items:

```wl
In[2]:= SemanticImport["ExampleData/buildings.dat", Automatic, "Rows", Delimiters -> ",", HeaderLines -> 1]//InputForm

Out[2]//InputForm=
{{"1 | Taipei 101 | Taipei | Taiwan | 2004 | 101 | 508"}, 
 {"2 | Petronas Tower 1 | Kuala Lumpur | Malaysia | 1998 | 88 | 452"}, 
 {"3 | Petronas Tower 2 | Kuala Lumpur | Malaysia | 1998 | 88 | 452"}, 
 {"4 | Sears Tower | Chic ...  China | 2005 | 60 | 333"}, 
 {"18 | Minsheng Bank Building | Wuhan | China | 2006 | 68 | 331"}, 
 {"19 | Ryugyong Hotel | Pyongyang | North Korea | 1995 | 105 | 330"}, 
 {"20 | Q1 | Gold Coast | Australia | 2005 | 78 | 323"}}
```

#### ExcludedLines (1)

Lines are excluded by row number prior to header selection or further processing. Here is raw data:

```wl
In[1]:= FilePrint["ExampleData/buildings.dat"]

Rank | Name | City | Country | Year | Stories | Height
1 | Taipei 101 | Taipei | Taiwan | 2004 | 101 | 508
2 | Petronas Tower 1 | Kuala Lumpur | Malaysia | 1998 | 88 | 452
3 | Petronas Tower 2 | Kuala Lumpur | Malaysia | 1998 | 88 | 452
4 | Sears Tower | Chicago | United States | 1974 | 110 | 442
5 | Jin Mao Building | Shanghai | China | 1999 | 88 | 421
6 | Two International Finance Centre | Hong Kong | China | 2003 | 88 | 415
7 | CITIC Plaza | Guangzhou | China | 1996 | 80 | 391
8 | Shun Hing Square | Shenzhen | China | 1996 | 69 | 384
9 | Empire State Building | New York | United States | 1931 | 102 | 381
10 | Central Plaza | Hong Kong | China | 1992 | 78 | 374
11 | Bank of China | Hong Kong | China | 1989 | 70 | 367
12 | Emirates Tower One | Dubai | United Arab Emirates | 1999 | 54 | 355
13 | Tuntex Sky Tower | Kaohsiung | Taiwan | 1997 | 85 | 348
14 | Aon Centre | Chicago | United States | 1973 | 80 | 346
15 | The Center | Hong Kong | China | 1998 | 73 | 346
16 | John Hancock Center | Chicago | United States | 1969 | 100 | 344
17 | Shimao International Plaza | Shanghai | China | 2005 | 60 | 333
18 | Minsheng Bank Building | Wuhan | China | 2006 | 68 | 331
19 | Ryugyong Hotel | Pyongyang | North Korea | 1995 | 105 | 330
20 | Q1 | Gold Coast | Australia | 2005 | 78 | 323
```

Excluding even line numbers gives the even-ranked buildings, because the header occupies line 1 and so the odd ranks fall on even lines:

```wl
In[2]:= SemanticImport["ExampleData/buildings.dat", Automatic, "NamedColumns", ExcludedLines -> Select[Range[40], EvenQ], HeaderLines -> 1]

Out[2]= <|"Rank" -> {2, 4, 6, 8, 10, 12, 14, 16, 18, 20}, "Name" -> {"Petronas Tower 1", "Sears Tower", "Two International Finance Centre", "Shun Hing Square", "Central Plaza", "Emirates Tower One", "Aon Centre", "John Hancock Center", "Minsheng Bank Build ... "Country", "China"], Entity["Country", "Australia"]}, "Year" -> {1998, 1974, 2003, 1996, 1992, 1999, 1973, 1969, 2006, 2005}, "Stories" -> {88, 110, 88, 69, 78, 54, 80, 100, 68, 78}, "Height" -> {452, 442, 415, 384, 374, 355, 346, 344, 331, 323}|>
```

#### HeaderLines (1)

Specify the number of lines in the file to treat as a header:

```wl
In[1]:= SemanticImport["ExampleData/buildings.dat", Automatic, "NamedColumns", HeaderLines -> 0]//Keys

Out[1]= {"column1", "column2", "column3", "column4", "column5", "column6", "column7"}

In[2]:= SemanticImport["ExampleData/buildings.dat", Automatic, "NamedColumns", HeaderLines -> 1]//Keys

Out[2]= {"Rank", "Name", "City", "Country", "Year", "Stories", "Height"}

In[3]:= SemanticImport["ExampleData/buildings.dat", Automatic, "NamedColumns", HeaderLines -> 2]//Keys

Out[3]= {"Rank 1", "Name Taipei 101", "City Taipei", "Country Taiwan", "Year 2004", "Stories 101", "Height 508"}
```

#### MissingDataRules (2)

Treat strings that start with "Sears" as missing, designated by "Willis Tower":

```wl
In[1]:= SemanticImport["ExampleData/buildings.dat", Automatic, "NamedColumns", MissingDataRules -> {("Sears" ~~ ___) -> "Willis Tower"}, HeaderLines -> 1]["Name"]

Out[1]= {"Taipei 101", "Petronas Tower 1", "Petronas Tower 2", Missing["ByDesignation", "Willis Tower"], "Jin Mao Building", "Two International Finance Centre", "CITIC Plaza", "Shun Hing Square", "Empire State Building", "Central Plaza", "Bank of China", "Emirates Tower One", "Tuntex Sky Tower", "Aon Centre", "The Center", "John Hancock Center", "Shimao International Plaza", "Minsheng Bank Building", "Ryugyong Hotel", "Q1"}
```

---

Rules are applied before interpretation:

```wl
In[1]:= SemanticImport["ExampleData/buildings.dat", Automatic, "NamedColumns", MissingDataRules -> {"United" ~~ ___ -> Missing["United country caught before interpretation"]}, "HeaderLines" -> 1]["Country"]

Out[1]= {Entity["Country", "Taiwan"], Entity["Country", "Malaysia"], Entity["Country", "Malaysia"], Missing["United country caught before interpretation"], Entity["Country", "China"], Entity["Country", "China"], Entity["Country", "China"], Entity["Country" ... ed country caught before interpretation"], Entity["Country", "China"], Missing["United country caught before interpretation"], Entity["Country", "China"], Entity["Country", "China"], Entity["Country", "NorthKorea"], Entity["Country", "Australia"]}
```

### Applications (6)

Import a table containing the flight cost from London to many countries as a ``Dataset`` object:

```wl
In[1]:= data = SemanticImport["ExampleData/countries-currency"]

Out[1]= Dataset[<>]
```

Get the geographic position of London:

```wl
In[2]:= london = CityData["London", "Coordinates"]

Out[2]= {51.5, -0.116667}
```

Get the maximum price of a flight:

```wl
In[3]:= maxPrice = Max[data[[All, "Price"]]]

Out[3]= Quantity[1844, "USDollars"]
```

Make a map showing the least expensive flight routes in blue and the most expensive ones in orange:

```wl
In[4]:=
GeoGraphics[{AbsoluteThickness[2], Normal[{
      Blend[{Blue, Orange}, #Price / maxPrice], 
      GeoPath[{london, CountryData[#"Flight Costs", "CenterCoordinates"]}]
      }& /@ data[SortBy[Key["Price"]]]]}]

Out[4]= [image]
```

---

Import the data for a timeline of personal emails:

```wl
In[1]:= data = SemanticImport["ExampleData/dates-categories"]

Out[1]= Dataset[<>]
```

Get the values that are in the "family" category:

```wl
In[2]:= family = data[Select[#[[2]] == "family"&]]

Out[2]= Dataset[<>]
```

Plot email count per month:

```wl
In[3]:= DateListPlot@CountsBy[family, DateValue[First[#], {"Year", "Month"}]&]

Out[3]= [image]
```

---

Import the first and third columns from a table of salaries for college faculty members:

```wl
In[1]:= data = SemanticImport["ExampleData/categories-numbers", <|"Salary" -> Automatic, "Rank" -> Automatic|>]

Out[1]= Dataset[<>]
```

Plot the result:

```wl
In[2]:= ListPlot[data, AxesLabel -> {"Salary", "Rank"}]

Out[2]= [image]
```

---

Import a dataset consisting of dates and numeric values as a ``Dataset`` object:

```wl
In[1]:= SemanticImport["ExampleData/financialtimeseries.csv"]

Out[1]= Dataset[<>]
```

Obtain the data as a list of rows:

```wl
In[2]:= SemanticImport["ExampleData/financialtimeseries.csv", Automatic, "Rows"]//Short

Out[2]//Short= {{DateObject[{2006, 1, 3}, "Day", "Gregorian", -5.], 11.82}, «249», {DateObject[{2006, 12, 29}, "Day", "Gregorian", -5.], 13.91}}
```

Specify that dates should be interpreted as strings:

```wl
In[3]:= SemanticImport["ExampleData/financialtimeseries.csv", {"String", Automatic}, "Rows"]//Short

Out[3]//Short= {{"Jan 03 2006", 11.82}, {"Jan 04 2006", 12.04}, «247», {"Dec 28 2006", 14.01}, {"Dec 29 2006", 13.91}}
```

---

Import a dataset containing a list of famous buildings and their properties as a ``Dataset`` object. Cities and countries are automatically detected as ``Entity`` objects:

```wl
In[1]:= SemanticImport["ExampleData/buildings.dat", "HeaderLines" -> 1]

Out[1]= Dataset[<>]
```

---

Import only the Name, Country, and Height columns of the famous building dataset:

```wl
In[1]:= SemanticImport["ExampleData/buildings.dat", <|"Name" -> Automatic, "Country" -> Automatic, "Height" -> Automatic|>, HeaderLines -> 1]

Out[1]= Dataset[<>]
```

### Possible Issues (3)

Automatic selection chooses from a less rich set of types than ``Interpreter``:

```wl
In[1]:= SemanticImport["ExampleData/buildings.dat", Automatic, "NamedColumns", HeaderLines -> 1]["Name"]

Out[1]= {"Taipei 101", "Petronas Tower 1", "Petronas Tower 2", "Sears Tower", "Jin Mao Building", "Two International Finance Centre", "CITIC Plaza", "Shun Hing Square", "Empire State Building", "Central Plaza", "Bank of China", "Emirates Tower One", "Tuntex Sky Tower", "Aon Centre", "The Center", "John Hancock Center", "Shimao International Plaza", "Minsheng Bank Building", "Ryugyong Hotel", "Q1"}
```

Specify explicit types to import ``Entity`` objects rather than strings:

```wl
In[2]:= SemanticImport["ExampleData/buildings.dat", {"Integer", "Building", "City", "Country", "Date", "Integer", "Integer"}, "NamedColumns", HeaderLines -> 1]["Name"]

Out[2]= {Entity["Building", "Taipei101::9b82t"], Entity["Building", "PetronasTower1::245r5"], Entity["Building", "PetronasTower2::k6grj"], Entity["Building", "WillisTower::8tzqg"], Entity["Building", "JinMaoTower::b97bc"], Entity["Building", "TwoInternatio ... , "The Center"], Entity["Building", "JohnHancockCenter::97pzs"], Entity["Building", "ShimaoInternationalPlaza::w72t4"], Entity["Building", "MinshengBankBuilding::8qc6n"], Entity["Building", "RyugyongHotel::8qpb9"], Entity["Building", "Q1::mzrr3"]}
```

---

An ``Automatic`` type specifies an automatically selected number of columns:

```wl
In[1]:= SemanticImport["ExampleData/buildings.dat", Automatic, "Rows", HeaderLines -> 1]//Take[#, 5]&

Out[1]= {{1, "Taipei 101", Entity["City", {"TaipeiCity", "TaipeiCity", "Taiwan"}], Entity["Country", "Taiwan"], 2004, 101, 508}, {2, "Petronas Tower 1", Entity["City", {"KualaLumpur", "KualaLumpur", "Malaysia"}], Entity["Country", "Malaysia"], 1998, 88, 45 ...  {4, "Sears Tower", Entity["City", {"Chicago", "Illinois", "UnitedStates"}], Entity["Country", "UnitedStates"], 1974, 110, 442}, {5, "Jin Mao Building", Entity["City", {"Shanghai", "Shanghai", "China"}], Entity["Country", "China"], 1999, 88, 421}}
```

An ``{Automatic}`` type specifies a single column of automatically selected type:

```wl
In[2]:= SemanticImport["ExampleData/buildings.dat", {Automatic}, "Rows", HeaderLines -> 1]//Take[#, 5]&//InputForm

Out[2]//InputForm=
{{"1 | Taipei 101 | Taipei | Taiwan | 2004 | 101 | 508"}, 
 {"2 | Petronas Tower 1 | Kuala Lumpur | Malaysia | 1998 | 88 | 452"}, 
 {"3 | Petronas Tower 2 | Kuala Lumpur | Malaysia | 1998 | 88 | 452"}, 
 {"4 | Sears Tower | Chicago | United States | 1974 | 110 | 442"}, 
 {"5 | Jin Mao Building | Shanghai | China | 1999 | 88 | 421"}}
```

``Automatic`` in a type list applies to the corresponding column sequentially:

```wl
In[3]:= SemanticImport["ExampleData/buildings.dat", {Automatic, Automatic, Automatic, Automatic, Automatic, Automatic, Automatic}, "Rows", HeaderLines -> 1]//Take[#, 5]&

Out[3]= {{1, "Taipei 101", Entity["City", {"TaipeiCity", "TaipeiCity", "Taiwan"}], Entity["Country", "Taiwan"], 2004, 101, 508}, {2, "Petronas Tower 1", Entity["City", {"KualaLumpur", "KualaLumpur", "Malaysia"}], Entity["Country", "Malaysia"], 1998, 88, 45 ...  {4, "Sears Tower", Entity["City", {"Chicago", "Illinois", "UnitedStates"}], Entity["Country", "UnitedStates"], 1974, 110, 442}, {5, "Jin Mao Building", Entity["City", {"Shanghai", "Shanghai", "China"}], Entity["Country", "China"], 1999, 88, 421}}
```

---

The default ``Automatic`` selection of header lines can be incorrect, depending on whether data is organized in rows or columns:

```wl
In[1]:= SemanticImport["ExampleData/buildings.dat", Automatic, "Rows"]//Take[#, 2]&

Out[1]= {{Missing["Unrecognized", "Rank"], "Name", Missing["Unrecognized", "City"], Missing["Unrecognized", "Country"], Missing["Unrecognized", "Year"], Missing["Unrecognized", "Stories"], Missing["Unrecognized", "Height"]}, {1, "Taipei 101", Entity["City", {"TaipeiCity", "TaipeiCity", "Taiwan"}], Entity["Country", "Taiwan"], 2004, 101, 508}}
```

Specify the number of header lines explicitly to import the data correctly:

```wl
In[2]:= SemanticImport["ExampleData/buildings.dat", Automatic, "Rows", "HeaderLines" -> 1]//Take[#, 2]&

Out[2]= {{1, "Taipei 101", Entity["City", {"TaipeiCity", "TaipeiCity", "Taiwan"}], Entity["Country", "Taiwan"], 2004, 101, 508}, {2, "Petronas Tower 1", Entity["City", {"KualaLumpur", "KualaLumpur", "Malaysia"}], Entity["Country", "Malaysia"], 1998, 88, 452}}
```
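
One way to decide whether an explicit ``HeaderLines`` setting is needed is to look at the first raw lines of the file before importing. The following is a sketch under the assumption that the file is delimited by ``"|"`` as shown by ``FilePrint`` earlier on this page; ``firstLines``, ``candidateHeader`` and ``importedNames`` are illustrative names:

```wl
file = FindFile["ExampleData/buildings.dat"];

(* read the first two raw lines without any interpretation *)
firstLines = ReadList[file, String, 2];

(* split the first line on the delimiter and trim whitespace *)
candidateHeader = StringTrim /@ StringSplit[First[firstLines], "|"]

(* column names obtained when the first line is declared a header *)
importedNames = Keys[SemanticImport[file, Automatic, "NamedColumns", HeaderLines -> 1]];

(* check whether the two agree *)
importedNames === candidateHeader
```

If the first line reads like a list of labels rather than data, setting ``HeaderLines -> 1`` explicitly avoids relying on automatic detection.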

## See Also

* [`SemanticImportString`](https://reference.wolfram.com/language/ref/SemanticImportString.en.md)
* [`SemanticInterpretation`](https://reference.wolfram.com/language/ref/SemanticInterpretation.en.md)
* [`Interpreter`](https://reference.wolfram.com/language/ref/Interpreter.en.md)
* [`WolframAlpha`](https://reference.wolfram.com/language/ref/WolframAlpha.en.md)
* [`Import`](https://reference.wolfram.com/language/ref/Import.en.md)
* [`Dataset`](https://reference.wolfram.com/language/ref/Dataset.en.md)

## Related Guides

* [Free-Form & External Input](https://reference.wolfram.com/language/guide/FreeFormAndExternalInput.en.md)
* [Computation with Structured Datasets](https://reference.wolfram.com/language/guide/ComputationWithStructuredDatasets.en.md)
* [WDF (Wolfram Data Framework)](https://reference.wolfram.com/language/guide/WDFWolframDataFramework.en.md)
* [Knowledge Representation & Access](https://reference.wolfram.com/language/guide/KnowledgeRepresentationAndAccess.en.md)
* [Scientific Data Analysis](https://reference.wolfram.com/language/guide/ScientificDataAnalysis.en.md)
* [Setting Up Input Interpreters](https://reference.wolfram.com/language/guide/InterpretingStrings.en.md)
* [Text Analysis](https://reference.wolfram.com/language/guide/TextAnalysis.en.md)

## Related Workflows

* [Import a File](https://reference.wolfram.com/language/workflow/ImportAFile.en.md)
* [Extract Columns in a Dataset](https://reference.wolfram.com/language/workflow/ExtractColumnsInADataset.en.md)

## Related Links

* [An Elementary Introduction to the Wolfram Language: Datasets](https://www.wolfram.com/language/elementary-introduction/45-datasets.html)

## History

* [Introduced in 2014 (10.0)](https://reference.wolfram.com/language/guide/SummaryOfNewFeaturesIn100.en.md) \| [Updated in 2016 (11.0)](https://reference.wolfram.com/language/guide/SummaryOfNewFeaturesIn110.en.md)