Structured arrays in NumPy are arrays whose elements are records made up of named fields, each with its own data type. They let you store and manipulate heterogeneous, table-like data in a single array.
Creating Structured Arrays
To create a structured array, define a dtype that gives each field a name and a data type, then pass it to np.array together with the records, each written as a tuple.
Python
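import numpy as np
# Each field is (name, type) or (name, type, length); 'name' holds up to 20 characters.
dtype = np.dtype([('name', np.str_, 20), ('age', np.int32), ('salary', np.float64)])
# Each record is written as a tuple with one value per field.
data = np.array([('Alice', 30, 100000), ('Bob', 25, 80000)], dtype=dtype)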
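To check what was actually built, it can help to inspect the resulting dtype. The following is a minimal sketch using the same two records; the dictionary-based dtype (dtype_from_dict, a name introduced here for illustration) is an alternative spelling of the same layout, not something the example above requires.

```python
import numpy as np

dtype = np.dtype([('name', np.str_, 20), ('age', np.int32), ('salary', np.float64)])
data = np.array([('Alice', 30, 100000), ('Bob', 25, 80000)], dtype=dtype)

print(data.dtype.names)     # ('name', 'age', 'salary')
print(data.shape)           # (2,) : one element per record
print(data.dtype.itemsize)  # bytes occupied by one record
print(data[0])              # the first record

# Alternative (illustrative) way to describe the same layout with a dictionary.
dtype_from_dict = np.dtype({
    'names': ['name', 'age', 'salary'],
    'formats': ['U20', 'i4', 'f8'],
})
print(dtype_from_dict == dtype)
```

Note that the array is one-dimensional: the fields are part of each element's dtype, not an extra axis.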
Accessing Fields
You can access individual fields of a structured array using field names:
Python
names = data['name']
ages = data['age']
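Two details of field access are worth seeing in code: indexing with a single field name returns a view onto the original array, and indexing with a list of names selects several fields at once. A small sketch, reusing the data array from above (the variable name subset is introduced here for illustration):

```python
import numpy as np

dtype = np.dtype([('name', np.str_, 20), ('age', np.int32), ('salary', np.float64)])
data = np.array([('Alice', 30, 100000), ('Bob', 25, 80000)], dtype=dtype)

# A single-field index is a view, so writing to it changes `data` as well.
ages = data['age']
ages[0] = 31
print(data['age'])

# A list of field names selects several fields in one step.
subset = data[['name', 'salary']]
print(subset.dtype.names)  # ('name', 'salary')
```

If you need an independent array, take a copy explicitly, for example data['age'].copy().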
Structured Array Operations
Many NumPy operations can be applied to structured arrays, including:
- Indexing and slicing: Accessing specific elements or subsets of the array.
- Arithmetic operations: Performing element-wise arithmetic on individual numeric fields (arithmetic is not defined on whole records).
- Aggregation functions: Calculating statistics like sum, mean, and standard deviation over a numeric field.
- Sorting: Sorting the array based on the values of a specific field.
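The operations listed above can be sketched as follows. The third record ('Carol') is made-up sample data added only so the results are easier to tell apart.

```python
import numpy as np

dtype = np.dtype([('name', np.str_, 20), ('age', np.int32), ('salary', np.float64)])
data = np.array(
    [('Alice', 30, 100000), ('Bob', 25, 80000), ('Carol', 41, 120000)],  # 'Carol' is hypothetical
    dtype=dtype,
)

# Indexing and slicing work on whole records.
first = data[0]                 # a single record
first_two = data[:2]            # a slice of records
older = data[data['age'] > 28]  # boolean mask built from one field

# Arithmetic applies to a numeric field, not to the record as a whole.
raised = data['salary'] * 1.05

# Aggregations likewise operate on a single field.
mean_salary = data['salary'].mean()

# Sorting records by the values of one field.
by_age = np.sort(data, order='age')
print(by_age['name'])
```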
Advanced Features
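- Nested Structures: A field's data type can itself be a structured dtype, so records can contain sub-records.
- Record Arrays: np.recarray is a variant of the structured array that also exposes fields as attributes (for example, rec.age in addition to rec['age']).
- Field Access Using Integers: A single record can be indexed by field position (for example, data[0][1] is the second field of the first record). On the array itself, an integer index selects a record, not a field.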
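A short sketch of these three features. The nested dtype and its field names ('label', 'pos', 'x', 'y') are invented for illustration; the record-array part reuses the name/age/salary layout from earlier.

```python
import numpy as np

# Nested structure: the 'pos' field is itself a structured dtype.
point = np.dtype([('x', np.float64), ('y', np.float64)])
nested = np.dtype([('label', np.str_, 10), ('pos', point)])
points = np.array([('a', (1.0, 2.0)), ('b', (3.0, 4.0))], dtype=nested)
xs = points['pos']['x']  # chain field names to reach the inner field

# Record array: view existing structured data as np.recarray
# to get attribute-style field access.
dtype = np.dtype([('name', np.str_, 20), ('age', np.int32), ('salary', np.float64)])
data = np.array([('Alice', 30, 100000), ('Bob', 25, 80000)], dtype=dtype)
rec = data.view(np.recarray)
ages = rec.age  # same values as data['age']

# Integer index on a single record: data[0] is the first record,
# and data[0][1] is that record's second field ('age').
first_age = data[0][1]
print(xs, ages, first_age)
```

Attribute access is convenient for interactive work, but a field whose name collides with an existing array attribute (such as shape) still has to be reached with the bracket syntax.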
Example: Employee Database
Python
import numpy as np
dtype = np.dtype([
    ('employee_id', np.int32),
    ('name', np.str_, 30),
    ('department', np.str_, 20),
    ('salary', np.float64),
])
employees = np.array([
(1, 'Alice', 'Sales', 50000),
(2, 'Bob', 'Marketing', 60000),
(3, 'Charlie', 'Engineering', 70000)
], dtype=dtype)
# Access specific fields
employee_ids = employees['employee_id']
names = employees['name']
# Sort by salary
sorted_employees = np.sort(employees, order='salary')
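A few typical follow-up queries on the same employee data might look like the sketch below. The variable names (by_salary_desc, engineering, average_salary) are chosen here for illustration.

```python
import numpy as np

dtype = np.dtype([
    ('employee_id', np.int32),
    ('name', np.str_, 30),
    ('department', np.str_, 20),
    ('salary', np.float64),
])
employees = np.array([
    (1, 'Alice', 'Sales', 50000),
    (2, 'Bob', 'Marketing', 60000),
    (3, 'Charlie', 'Engineering', 70000),
], dtype=dtype)

# np.sort with order= sorts ascending; reverse the result for highest salary first.
by_salary_desc = np.sort(employees, order='salary')[::-1]

# A boolean mask built from one field selects whole records.
engineering = employees[employees['department'] == 'Engineering']

# Aggregate a single numeric field.
average_salary = employees['salary'].mean()

print(by_salary_desc['name'])
print(engineering['name'])
print(average_salary)
```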
Structured arrays give you a compact way to hold and manipulate fixed-layout records in NumPy. They are particularly useful for working with tabular data, rows loaded from databases, and scientific datasets.