Oracle Tables
A table consists of rows and columns; each row corresponds to a single record. When creating a table you give it a name and define the columns that belong to it. You can specify the width or the precision and scale for certain columns, and some of the columns can contain default values. Oracle uses a number of different data types for columns; see the data types section for more information.
Dual Table
The dual table belongs to the sys schema and is created automatically when the data dictionary is created. Everything in Oracle has to be in a table; even the result of an arithmetic expression has to come from a table, so a query that retrieves such a result needs a table to select from. The dual table serves as a catch-all table for those expressions.
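A quick sketch of typical dual usage:

```sql
-- dual has exactly one row and one column, so a query against it
-- returns exactly one row - handy for evaluating expressions
select 2 * 3 from dual;       -- arithmetic expression
select sysdate from dual;     -- current date/time
select user from dual;        -- current schema name
```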
Oracle tables
There are four major types of tables in Oracle:
| Heap Organized Tables | This is the normal, standard table. Data is managed in a heap-like fashion. A heap is a bunch of space that is used in a somewhat random fashion. |
| Index Organized Tables | Here the table is stored in an index structure (B-Tree). Unlike a heap table, the rows themselves are kept in order, determined by the primary key. |
| Clustered Tables | Several tables may be stored joined together on the same data blocks; rows that contain the same cluster key value are physically stored together. The data is clustered around the cluster key value, and the cluster key is built using a B-Tree index. |
| Partitioned Tables | You can divide a large amount of data into sub-tables called partitions, according to various criteria. Partitioning is especially useful in data warehousing. |
Oracle uses namespaces, which means the same name cannot be used for two objects within the same namespace. Below is a list of the namespaces within which object names must be unique.
| namespace1 | tables, views, sequences, private synonyms, procedures, functions, packages, materialized views, user-defined types |
| namespace2 | tablespaces, indexes, constraints, clusters, triggers, database links, dimensions, roles, public synonyms, profiles and pfiles |
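As a consequence, objects in different namespaces may legally share a name, while objects in the same namespace may not. A small sketch (the name demo is illustrative):

```sql
-- tables and indexes are in different namespaces, so this is legal
create table demo (id number);
create index demo on demo (id);

-- tables and views share a namespace, so this fails with ORA-00955
-- ("name is already used by an existing object")
create view demo as select id from demo;
```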
Before you create any table it is a good idea to have an estimate of how big the table is going to be, as this allows you to make the right decision about space allocation. Oracle 10g has a table-size estimation tool within OEM, or you can use the DBMS_SPACE package.
| DBMS_SPACE | DECLARE l_used_bytes NUMBER(10); l_allocated_bytes NUMBER(10); BEGIN DBMS_SPACE.CREATE_TABLE_COST( tablespace_name=>'TEST', avg_row_size=>120, row_count=>1000000, pct_free=>10, used_bytes=>l_used_bytes, alloc_bytes=>l_allocated_bytes); DBMS_OUTPUT.PUT_LINE('used = ' || l_used_bytes || ' bytes ' || 'allocated = ' || l_allocated_bytes || ' bytes'); END; / |
| OEM | Admin tab -> Tables (in schema list) -> Create Button -> select table type -> enter the data -> enter estimated number of rows |
High Water Mark
If you view a table as a flat structure, a series of blocks laid one after another in a line from left to right, the high water mark (HWM) would be the right-most block that has ever contained data. Over time the high water mark rises as data is added. However, if data is deleted there may be empty blocks under the high water mark; it moves back down only when the object is rebuilt, moved, or emptied with the TRUNCATE command.
The high water mark is relevant because during a full scan Oracle will read every block under the mark, even blocks that contain no data. To see how you can reduce the HWM see automatic segment space management.
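The effect can be sketched as below; the queries assume an emp table in the current schema, and the exact block counts depend on extent and segment management:

```sql
-- deleting every row does NOT lower the high water mark;
-- a subsequent full scan still reads all the formerly used blocks
delete from emp;
commit;
select blocks from user_segments where segment_name = 'EMP';

-- truncate resets the high water mark (and by default deallocates
-- the space above the initial allocation)
truncate table emp;
select blocks from user_segments where segment_name = 'EMP';
```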
Freelist
The freelist is where Oracle keeps track of blocks under the high water mark that have free space available. Each object has at least one freelist associated with it. You can improve performance by creating additional freelists to reduce contention on heavily used objects. There are four parameters that can affect performance:
| FREELISTS | Every table manages the blocks it has allocated in the heap on a freelist. A table may have more than one freelist. Increase the number of freelists if a table is heavily used. |
| PCTFREE | A measure of how full a block can be made during an insert. Once the free space in a block falls below PCTFREE, the block is no longer used for inserts. |
| PCTUSED | A measure of how empty a block must become before it is used for inserts again. Once the used space in a block drops below PCTUSED, the block becomes available for inserts again. |
| INITRANS | The number of transaction slots initially allocated to a block. If set too low this can cause concurrency issues in a block that is used by many users. See locking for more information. |
Row migration occurs when a row is forced to leave the block it was created on because there is insufficient space left on that block. The row is migrated to another block and a pointer to the new location is left on the original block. Too many migrated rows lead to poor performance, as Oracle must perform an additional read to follow the pointer.
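Migrated (and chained) rows can be detected with the ANALYZE command; a sketch, assuming the CHAINED_ROWS table has been created with the utlchain.sql script:

```sql
-- record any migrated/chained rows of EMP in the CHAINED_ROWS table
analyze table emp list chained rows into chained_rows;
select count(*) from chained_rows where table_name = 'EMP';

-- alternatively, a full analyze populates chain_cnt in user_tables
analyze table emp compute statistics;
select chain_cnt from user_tables where table_name = 'EMP';
```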
Heap Organized Tables
This is the standard Oracle table; data is managed in a heaped fashion (no order). When an object table is created there is a bit of magic that happens behind the scenes: you get hidden columns, extra indexes, pseudo columns, etc.
Table
| Creating | create table emp ( empno number(5) primary key, ename varchar2(15) not null, national_insurance varchar2(9) not null, job varchar2(25), manager number(5), deptno number(3) not null constraint dept_fkey references hr.dept(dept_id)) tablespace users; |
| Creating with CTAS | create table emp_new parallel (degree 4) nologging as select * from emp; Note: CTAS = create table as select; load the data using multiple processes and do not log the changes to the redo logs (minimal information will always be logged). |
| Removing (restorable) | drop table emp cascade constraints; Note: the table can be retrieved with the 'flashback table' command |
| Removing (permanently) | drop table emp purge; |
| Renaming a table | alter table emp rename to employees; |
| Moving a table | alter table emp move tablespace new_tablespace; Note: do this when you want to change any storage parameters; also remember that the ROWIDs will change, so indexes will have to be re-created or rebuilt. |
| Emptying a table | truncate table emp; Note: there is no rollback from this command, it's a DDL command. |
| Restoring a table after a drop | flashback table emp to before drop; Note: see flashback table, this will not bring back a truncated table. |
Columns
| adding columns | alter table emp add(retired char(1)); |
| removing columns | alter table emp drop(retired); |
| Marking column/s as unused | alter table emp set unused (retired); select * from user_unused_col_tabs; Note: once a column is marked as unused there is no way to get it back other than a restore. |
| removing unused columns | alter table emp drop unused columns; |
| Removing unused columns and checkpointing | alter table emp drop unused columns checkpoint 10000; Note: this stops the undo tablespace from filling up by checkpointing after every 10,000 rows. |
| Renaming a column | alter table emp rename retired to dead; |
Rows
| Count number of rows | select count(*) from emp; |
| Delete duplicate rows | delete from t where rowid in (select rid from (select rowid rid, row_number() over (partition by cust_seg_nbr order by rowid) rn from t) where rn <> 1); |
Temporary Tables
Temporary tables are used to hold result sets, either for the duration of a transaction or of a session. The data held in a temporary table is only ever visible to the current session; no session can see another session's data, and a session will never block another session by using its temporary table. Also remember that a temporary table will generate some undo information, which in turn generates some redo. The benefits of using a temporary table are:
- Temporary tables reduce the amount of redo activity generated by transactions
- Temporary tables can be indexed to improve performance
- Sessions can update, insert and delete in temporary tables just like normal tables
- Data is automatically removed at the end of the session or transaction
- Table constraints can be used on temporary tables
- Temporary tables can be used by a number of users, each seeing only their own data
- Minimal amount of locking of temporary tables means more efficient query processing
- The table structure persists after the data is removed, so the table can be used again
Temporary tables also have some restrictions:
- They cannot have referential integrity constraints: they can neither be the target of a foreign key nor have a foreign key defined on them
- They cannot have VARRAY or NESTED TABLE type columns
- They cannot be index organized tables (IOT)
- They cannot be in an index or hash cluster
- They cannot be partitioned
- They cannot have statistics generated via the ANALYZE table command.
A temporary table's data lasts for either:
- the duration of the transaction - ON COMMIT DELETE ROWS
- the duration of the session - ON COMMIT PRESERVE ROWS
| Listing | select table_name, temporary from user_tables where temporary = 'Y'; |
| Creating (session table) | create global temporary table flight_status ( destination varchar2(30), start_date date, return_date date, ticket_price number) on commit preserve rows; |
| Creating (transaction table) | create global temporary table flight_status ( destination varchar2(30), start_date date, return_date date, ticket_price number) on commit delete rows; |
| Creating a Temp table with CTAS | create global temporary table temp_test on commit delete rows as select * from all_objects where 1=0; Note: the 'on commit delete rows' is stated before the CTAS |
| Removing | drop table flight_status; Note: when you drop a temporary table it does not go into the recyclebin |
Index Organized Tables (IOT)
An IOT is a table stored in an index structure (B-Tree), kept in an ordered fashion sorted by its primary key. When using an overflow for an IOT, additional data that cannot fit into the index is stored in an overflow segment, with a pointer from the index row to the additional data. Getting the right mix of data on the index block versus data in the overflow segment is the most critical part of IOT setup. Consider the freelists, as these can affect the table; PCTFREE is not that important and PCTUSED doesn't normally come into play. When using an overflow segment, PCTFREE and PCTUSED have the same meaning as they do for a heap table.
The differences between a regular Oracle table and an IOT:
| Regular Oracle Table | IOT |
| Physical ROWIDs | Logical ROWIDs |
| Uniquely identified by ROWID | Uniquely identified by primary key |
| Unique constraints allowed | Unique constraints not allowed |
| Can contain LONG and LOB data | Can't contain LONG data |
| Allowed in table clusters | Not allowed in table clusters |
| Larger space requirements | Smaller space requirements |
| Slower data access | Faster data access |
| Do not order data | Order data |
IOT Rules:
- An IOT can have additional indexes defined on other columns in the table
- An IOT without secondary indexes is stored as a single segment
- The space requirements for an IOT are reduced because the data is stored in the same segment as the index, and therefore no physical ROWID is required
- An IOT can be rebuilt without locking the table
- An IOT must be created with a non-deferrable primary key constraint; this is used to create the index that holds the table
- An IOT cannot be a clustered table
- Composite partitioning cannot be applied to an IOT
- Columns of type LONG cannot be defined in an IOT
- Access via the primary key is fast, but access via secondary indexes can be slow
| Creating | create table employee_new ( employee_id number, dept_id number, name varchar2(30), address varchar2(120), constraint pk_employee_new primary key (employee_id)) organization index tablespace empindex_01 pctthreshold 25 overflow tablespace overflow_tbs; Note: 'organization index' states it's an IOT. pctthreshold states the percentage of space reserved in the index blocks; any part of the row that does not fit within the 25 percent threshold in each block is saved in the overflow area. |
| Removing | drop table employee_new; |
| Index info | select table_name, tablespace_name, iot_name, iot_type from user_tables; select index_name, tablespace_name, index_type, table_name from user_indexes; select segment_name, tablespace_name, segment_type from user_segments; |
| Secondary Bitmap Index | alter table emp_iot move mapping table; select segment_name, tablespace_name, segment_type from user_segments; (should see one called map) create bitmap index emp_dept_idx on emp_iot(emp_dept); |
External Tables
Oracle allows the use of external tables, that is, tables whose data resides in external operating system files. The definition of an external table is created in the data dictionary, which allows you to load its data into other Oracle tables; no space inside the database is actually used by the external table. If you drop an external table you are only removing its definition from the dictionary. See external tables for more details.
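A minimal sketch of an external table definition, assuming a comma-separated file emp.csv in a directory object called ext_dir (both names are illustrative):

```sql
-- the directory object maps a logical name to an OS path
create directory ext_dir as '/u01/app/external';

-- the table definition lives in the dictionary; the data stays in the file
create table emp_ext (
  empno number(5),
  ename varchar2(15))
organization external (
  type oracle_loader
  default directory ext_dir
  access parameters (
    records delimited by newline
    fields terminated by ',')
  location ('emp.csv'))
reject limit unlimited;
```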
Nested Tables
A nested table is one of the two collection types in Oracle. It is very similar to a child table in a traditional parent/child table pair, giving the illusion that each row in the parent table has its own child table; i.e. if there are 100 rows in the parent table there are virtually 100 nested tables. There are two ways to use a nested table: in PL/SQL code as a way to extend the PL/SQL language, and as a physical storage mechanism for persistent storage of collections. They are very rarely used as a storage mechanism, for the following reasons:
- The overhead of the RAW columns that are added. The parent table will have an extra 16 byte RAW for each nested table.
- The overhead of the unique constraint on the parent table, when it already typically has a unique constraint.
- The nested table is not easily used by itself without unsupported constructs. It can be un-nested for queries but not mass updates.
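For completeness, a hedged sketch of a nested table used as a storage mechanism (type and table names are illustrative):

```sql
-- collection type holding phone numbers
create type phone_list as table of varchar2(20);
/

-- each emp_nt row appears to own its own table of phone numbers,
-- physically stored in the emp_phones_nt segment
create table emp_nt (
  empno  number(5) primary key,
  ename  varchar2(15),
  phones phone_list)
nested table phones store as emp_phones_nt;

insert into emp_nt values (1, 'SMITH', phone_list('0191-111-1111', '0191-222-2222'));

-- un-nest the collection for querying
select e.empno, p.column_value phone
from   emp_nt e, table(e.phones) p;
```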
Partitioned Tables
Oracle tables can be quite large; partitioning basically means dividing a table up into smaller chunks. All the partitions share the same logical definition, column definitions and constraints. Performance is greatly improved because you only search the relevant partitions of the table during a query. Partitions can be kept on different disks to further increase performance, and you can back up, index and load data into partitions independently of each other. There are five ways to partition data:
| range partitioning | used for data that can be separated into ranges based on some criterion i.e. date, part number, sequence. |
| hash partitioning | if data in a range partition is not evenly distributed (lots of data for one particular date) then performance decreases, hash partitioning uses oracle hashing algorithms to assign a hash value to each rows partitioning key and place it in the appropriate partition. Hopefully the data will be evenly distributed across the partitions. |
| list partitioning | used for data that can be separated into lists based on some criterion, i.e. city, territory. Again, data can be unevenly distributed across the partitions. |
| composite range-hash partitioning | basically a combination of range and hash partitioning, making sure that the data is evenly distributed across the partitions |
| composite range-list partitioning | basically a combination of range and list. First partition on a range of values then break up the first set of partitions using a list of discrete values. |
Examples
| range partitioning | create table sales_data ( ticket_no number, sale_year int not null, sale_month int not null, sale_day int not null) partition by range (sale_year, sale_month, sale_day) (partition sales_q1 values less than (2007, 04, 01) tablespace ts1, partition sales_q2 values less than (2007, 07, 01) tablespace ts2, partition sales_q3 values less than (2007, 10, 01) tablespace ts3, partition sales_q4 values less than (2008, 01, 01) tablespace ts4); Note: ideally each of the tablespaces should be on its own disk for increased performance |
| Hash partitioning | create table sales_data ( ticket_no number, sale_year int not null, sale_month int not null, sale_day int not null) partition by hash (ticket_no) partitions 4 store in (ts1, ts2, ts3, ts4); Note: you have no control over where the data is put |
| List Partitioning | create table sales_data ( ticket_no number, sale_year int not null, sale_month int not null, sale_day int not null, destination_city char(3), start_city char(3)) partition by list (start_city) (partition north_sales values ('newcastle', 'sunderland', 'leeds') tablespace ts1, partition south_sales values ('brighton', 'bournemouth', 'cornwall') tablespace ts2, partition east_sales values ('norwich', 'ipswitch', 'yarmouth') tablespace ts3, partition west_sales values ('birmingham', 'cardiff', 'coventry') tablespace ts4); |
| composite range-hash partitioning | create table football_kit (equipno number, equipname varchar(32), price number) partition by range (equipno) subpartition by hash(equipname) subpartitions 8 store in (ts1, ts2, ts3, ts4) (partition p1 values less than (1000), partition p2 values less than (2000), partition p3 values less than (3000), partition p4 values less than (MAXVALUE)); |
| composite range-list partitioning | create table quarterly_sales_data ( ticket_no number, sale_year int not null, sale_month int not null, sale_day int not null, destination_city char(3), start_city char(3)) partition by range (sale_day) subpartition by list (start_city) (partition q1_2007 values less than (to_date('1-apr-2007','dd-mon-yyyy')) tablespace ts1 (subpartition q12007_north_sales values ('newcastle', 'sunderland', 'leeds'), subpartition q12007_south_sales values ('brighton', 'bournemouth', 'cornwall'), subpartition q12007_east_sales values ('norwich', 'ipswitch', 'yarmouth'), subpartition q12007_west_sales values ('birmingham', 'cardiff', 'coventry') ), partition q2_2007 values less than (to_date('1-jul-2007','dd-mon-yyyy')) tablespace ts1 (subpartition q22007_north_sales values ('newcastle', 'sunderland', 'leeds'), subpartition q22007_south_sales values ('brighton', 'bournemouth', 'cornwall'), subpartition q22007_east_sales values ('norwich', 'ipswitch', 'yarmouth'), subpartition q22007_west_sales values ('birmingham', 'cardiff', 'coventry') ), partition q3_2007 values less than (to_date('1-oct-2007','dd-mon-yyyy')) tablespace ts1 (subpartition q32007_north_sales values ('newcastle', 'sunderland', 'leeds'), subpartition q32007_south_sales values ('brighton', 'bournemouth', 'cornwall'), subpartition q32007_east_sales values ('norwich', 'ipswitch', 'yarmouth'), subpartition q32007_west_sales values ('birmingham', 'cardiff', 'coventry') ), partition q4_2007 values less than (to_date('1-jan-2008','dd-mon-yyyy')) tablespace ts1 (subpartition q42007_north_sales values ('newcastle', 'sunderland', 'leeds'), subpartition q42007_south_sales values ('brighton', 'bournemouth', 'cornwall'), subpartition q42007_east_sales values ('norwich', 'ipswitch', 'yarmouth'), subpartition q42007_west_sales values ('birmingham', 'cardiff', 'coventry') ) ); |
Partition Maintenance
| Adding partitions | alter table sales_data add partition sales_quarters values less than (to_date('1-apr-2007','dd-mon-yyyy')) tablespace ts4; |
| Splitting partitions | alter table sales_data split partition ticket_sales01 at (2000) into (partition ticket_sales01A, partition ticket_sales01B); |
| Merging partitions | alter table ticket_sales merge partitions ticket_sales01A, ticket_sales01B into partition ticket_sales02; |
| Renaming partitions | alter table ticket_sales rename partition ticket_sales01B to ticket_sales01C; |
| Exchanging partitions | alter table ticket_sales exchange partition ticket_sales02 with table ticket_sales03; Note: this enables you to convert a regular non-partitioned table into a partition of a partitioned table. |
| Dropping partitions | alter table sales_data drop partition sales_quarters; Note: if you have data in the partitions you intend to drop, you need to use the 'update global indexes' clause with the preceding statement, otherwise all global indexes will be invalidated. Local indexes will still be okay because they're mapped directly to the affected partitions only. |
| Coalescing partitions | alter table sales_data coalesce partition; Note: you can only coalesce hash and list partitioned tables |
Clustered Tables
A cluster is a way to store a group of tables that share some common columns in the same database blocks, so that related data is stored together on the same block. The goal is to reduce disk I/O and therefore increase access speed when you join the tables. You should not use a clustered table when:
- Clusters may impact the performance of DML: if the tables are heavily modified, an index cluster will have certain negative performance side effects
- Full scans of a table in a cluster are affected, as all the tables in the cluster will be scanned, so full scans take longer
- You frequently need to truncate and load the table. Tables in clusters cannot be truncated, since the cluster stores more than one table on a block; the rows must be deleted instead
| Permissions | grant create cluster to test02; |
| Creating | # Create the cluster create cluster emp_dept( deptno number (3)) tablespace users; # Create the two tables that are part of the cluster create table dept ( deptno number (3) primary key) cluster emp_dept (deptno); create table emp ( empno number (5) primary key, ename varchar2(15), deptno number (3) references dept) cluster emp_dept (deptno); |
| Removing | drop cluster emp_dept including tables; |
| Cluster Information | select table_name, tablespace_name, cluster_name from user_tables; (only shows the tables, not the cluster) select segment_name, tablespace_name, segment_type from user_segments; (only shows the cluster, not the tables) |
Hash Clustered Tables
Hash cluster tables are similar to index cluster tables except that a cluster index is not used; the data is the index in this case. The cluster key is hashed into a block address and the data is expected to be there. The important things to understand are:
- The space for a hash cluster is allocated right from the beginning. Oracle will take HASHKEYS/trunc(blocksize/SIZE) and allocate that space right away; as soon as the first table is put in the cluster, any full scan will hit every allocated block. This is different from every other table.
- The number of HASHKEYS in a hash cluster is fixed; performance can suffer from unintended hash collisions if it is set too low.
- Range scanning on the cluster key is not available. The table will be full scanned if you use a range on the cluster key.
- Use a hash cluster when you know how many rows the table will have over its life, or a reasonable upper bound; if at all possible you should avoid ever rebuilding the table.
- DML, especially inserts, is light. Updates do not introduce overhead unless you update the HASHKEY, which would not be a good idea as it would cause the row to migrate.
- You access the data by the HASHKEY value constantly. For example, you have a table of parts accessed by part number. Lookup tables are appropriate for hash clusters.
| Creating | # Create the hash cluster create cluster emp_dept(deptno number (3)) tablespace users hash is deptno hashkeys 200; # Create the two tables that are part of the cluster create table dept ( deptno number (3) primary key) cluster emp_dept (deptno); create table emp ( empno number (5) primary key, ename varchar2(15), deptno number (3) references dept) cluster emp_dept (deptno); |
| Removing | drop cluster emp_dept including tables; |
Table Comments
You can place comments on tables and on columns within a table; this is helpful to anyone who will be supporting the database after the originator has left. Note that you can also put comments on views.
| Creating table comment | comment on table emp is 'This is the company employee table'; |
| Removing table comment | comment on table emp is ''; |
| Displaying table comments | select table_name, comments from user_tab_comments; |
| Create table column comment | comment on column emp.nat_i is 'National Insurance column'; |
| Removing table column comment | comment on column emp.nat_i is ''; |
| Display column comments | select table_name, column_name, comments from user_col_comments where table_name = 'EMP'; |
Online Table Redefinition
Oracle offers online table redefinition, which lets you redefine objects such as tables while users are still reading and writing to them. You can perform the following:
- add, drop and rename columns
- transform table data
- change column data types
- rename table constraints
- change original storage parameters
- reduce fragmentation
- create a partitioned table from a regular table
- create a IOT from a regular table
- move a table to a different tablespace.
There are two redefinition methods:
| primary key | Select a primary key or pseudo-primary key to use for the redefinition. Pseudo-primary keys are unique keys with all component columns having NOT NULL constraints. For this method, the versions of the tables before and after redefinition should have the same primary key columns. This is the preferred and default method of redefinition. |
| rowid | Use this method if no key is available. In this method, a hidden column named M_ROW$$ is added to the post-redefined version of the table; it is recommended that this column be dropped or marked as unused after the redefinition is complete. You cannot use this method on index-organized tables. |
The redefinition process consists of the following steps:
- Determine if a table is a good candidate for redefinition
- Decide the new structure for the table, and create a new image of the table
- Start the redefinition process by using the dbms_redefinition package
- Create the necessary constraints and triggers on the new table
- Perform periodic synchronization and validation of data in the new table
- Complete the redefinition of the table
- Cleanup
| Original table | employee_id not null number(6) first_name varchar2(20) last_name not null varchar2(20) email not null varchar2(25) phone_number varchar2(20) hire_date not null date job_id not null varchar2(20) salary number(8,2) commission_pct number(2,2) manager_id number(6) department_id number(4) |
| step 1 | -- Default option which uses the primary key option begin dbms_redefinition.can_redef_table('vallep','employees', dbms_redefinition.cons_use_pk); end; / Note: it is optional to use the cons_use_pk as this is the default method -- Using the more complex rowid option begin dbms_redefinition.can_redef_table('vallep','employees', dbms_redefinition.cons_use_rowid); end; / |
| step 2 | -- create the new table adding different storage parameters, partition, etc create table vallep.employees_temp ( employee_id number(6), first_name varchar2(20) not null, last_name varchar2(20) not null, email varchar2(25) not null, phone_number varchar2(20), hire_date date not null, job_id varchar2(20) not null, commission_pct number(2,2), manager_id number(6), department_id number(4)) partition by range(employee_id) (partition employees1 values less than (100) tablespace EMP01, partition employees2 values less than (300) tablespace EMP02); |
| step 3 | -- start the redefinition process begin dbms_redefinition.start_redef_table('vallep','employees', 'employees_temp', 'employee_id employee_id, first_name first_name, last_name last_name, email email, phone_number phone_number, hire_date hire_date, job_id job_id, commission_pct commission_pct, manager_id manager_id, department_id department_id'); end; / -- check the two tables select count(*) from employees; select count(*) from employees_temp; |
| step 4 | -- now copy the constraints, triggers, etc declare num_errors pls_integer; begin dbms_redefinition.copy_table_dependents('vallep','employees','employees_temp', dbms_redefinition.cons_orig_params, true, true, true, true, num_errors); end; / Note: during execution two tables are created, a temporary one and a permanent one. The temporary one, called rupd$_employee, lasts only for the session; the permanent one holds a snapshot of all the changes made to the original employees table, a bit like a materialized view log. |
| Step 5 | -- now need to check for any errors select object_name, base_table_name, ddl_txt from dba_redefinition_errors; -- now synchronize the data in the interim and source tables exec dbms_redefinition.sync_interim_table('vallep','employees','employees_temp'); |
| Step 6 | -- complete the redefinition exec dbms_redefinition.finish_redef_table('vallep','employees','employees_temp'); Note: the following happens: Oracle reads the materialized view log and updates the interim table; the employees table is redefined so it has the same attributes as the interim table; referential constraints/triggers involving the employees_temp table are enabled; the two tables are briefly locked in exclusive mode to make the necessary changes to the data dictionary; the materialized view and logs are dropped. |
| Step 7 | -- once you are happy with the employees table change you can drop the temp table drop table employees_temp; |
| If all hell breaks loose | -- if you get any errors during step 5, or you just want to abort, run the command exec dbms_redefinition.abort_redef_table('vallep','employees','employees_temp'); |
Useful Views
| DBA_TABLES | describes all relational tables in the database |
| DBA_TAB_COLUMNS | describes the columns of all tables, views, and clusters in the database |
| DBA_UNUSED_COL_TABS | lists all tables that have columns marked as unused |
| DBA_TAB_COMMENTS | displays comments on all tables and views in the database |
| DBA_COL_COMMENTS | displays comments on the columns of all tables and views in the database |
| DBA_TAB_PARTITIONS | provides the partition-level partitioning information, partition storage parameters, and partition statistics determined by ANALYZE statements for all partitions in the database |
You can dump the contents of a table block, see dump a block of data
Oracle Tables: Another Way to Classify Them
Being able to classify a table helps in two common situations: at design time when designing a schema/database, and when you inherit someone else's work and need to understand what is already in place. You may already have an intuitive understanding of how tables are used, but being able to specifically identify a table's purpose or function allows you to communicate requirements to others, whether they are database administrators or database developers. Let's start by considering a data workflow consisting of in, out, and record.
Stage tables
For many applications with a database back end, stage tables are the first point of entry (or first line of defense, depending on your perspective) for data entering a database. In their simplest form, stage tables are images of the contents of a data file. In terms of purpose, stage tables serve as intake, upload, or collector mechanisms. A very common method of populating stage tables is SQL*Loader. Stage table loading can also be used to perform data hygiene: create a unique index on a particular column, and the upload process takes care of removing duplicates, while other bad or discarded line items can be collected and reported back to the data provider. Depending on your needs, you can also use stage tables to begin or perform transformation processes. From a use perspective, you will probably want to reuse stage tables. The application or database - it doesn't matter which one, just as long as it gets done - should perform a clean-up or preparation step prior to uploading data. Is it done by using REPLACE in SQL*Loader or by a prepared statement in Java which uses TRUNCATE TABLE? From a design perspective, does the table need to have logging enabled? If so, why? Are you ever going to roll back or recover anything in a stage table? Probably not, so keep that type of DDL feature or option to a minimum. Now that the data is inside the database, where does it go next?
Operational tables
One way to define tables of this type is to consider what they contain. Live data, such as customer or account information, is stored in operational tables. These tables are the guts of the database, so to speak; in one form or another, everything in an application revolves around their contents. Operational tables should come first with respect to security, backup, and recovery, and should therefore employ whatever means or features are needed to meet those requirements. This includes logging, enabled row movement for flashback, explicit permissions, auditing, fine-grained access, virtual private database, and whatever else your situation/application may require. These tables are further characterized by how they are designed. Normalization is a process that should immediately come to mind, and along with it comes the consideration of related objects such as indexes, views, sequences, and triggers, to name a few. These tables are what I referred to as "normal" tables in the introduction, and normal seems to be a good fit for two reasons: first, these tables are normally seen throughout a schema; second, they typically need to be normalized.
Another type of operational table is one related to how an application or schema operates. An example would be storing information about disposition tables (see below). Another example is familiar for those who design security (roles and permissions) into a schema as opposed to having Oracle manage this function.
Functional tables
Lookup and intersection tables fall into this category, as their main purpose is to perform a function (i.e., resolve a many-to-many relationship). Another type of functional table is a shopping cart/order entry table (containing an order number and one or more line items). These tables are relatively easy to identify and their purpose is discussed in virtually all database design books. Functional tables bear mentioning because of their close association with operational tables; however, not all tables are created equal. Identify the crucial functional tables and place them in the same category, with respect to importance, as operational tables. How can you distinguish between them? One way is to consider how the contents change, that is, whether the data is static or not. Static tables (e.g., job code versus job description) can be repopulated from a script; dynamic tables require recovery. Do you populate these tables directly, or is it done indirectly (via a trigger)?
Disposition tables
Tables of this type are frequently created by performing a join between a stage table and a combination of one or more operational and functional tables. Disposition tables are commonly used for reporting or for staging data to be spooled out into disposition files. You may find many tables of this type in a schema, all with the same definition; what differentiates them is the date they were created or the number of records (a weekly report or newsletter recipient list consisting of fewer rows as compared to an end-of-month report or recipient list). A disposition table can also be similar to a stage table in that its contents are routinely flushed and repopulated for a pending disposition or report process. An alternative means of keeping disposition data is to borrow from the data warehouse side of things and use a flight-history type of table. You will not need to retain 100% of the disposition table's contents, but you will need to capture what is essential: who sent what, and when, and can details be reconstructed if necessary? If details are lost due to data changes elsewhere in the database, then the record-keeping requirements become more complicated. It then becomes a trade-off between keeping scores of disposition tables and maintaining a huge flight-history type of table.