All Implemented Interfaces: ModuleControl, ModuleSupportable, MethodFactory, SortFactory, SortCostController
Direct Known Subclasses: UniqueWithDuplicateNullsExternalSortFactory
public class ExternalSortFactory extends java.lang.Object implements SortFactory, ModuleControl, ModuleSupportable, SortCostController
Modifier and Type | Field | Description |
---|---|---|
protected static int | DEFAULT_MAX_MERGE_RUN | |
protected static int | DEFAULT_MEM_USE | |
private static int | DEFAULT_SORTBUFFERMAX | |
private int | defaultSortBufferMax | |
private UUID | formatUUID | |
private static java.lang.String | FORMATUUIDSTRING | |
private static java.lang.String | IMPLEMENTATIONID | |
private static int | MINIMUM_SORTBUFFERMAX | |
private static int | SORT_ROW_OVERHEAD | |
private int | sortBufferMax | |
private boolean | userSpecified | |
MODULE
Constructor | Description |
---|---|
ExternalSortFactory() | |
Modifier and Type | Method | Description |
---|---|---|
void | boot(boolean create, java.util.Properties startParams) | Boot this module with the given properties. |
boolean | canSupport(java.util.Properties startParams) | See if this implementation can support any attributes that are listed in properties. |
void | close() | Close the controller. |
Sort | createSort(TransactionController tran, int segment, java.util.Properties implParameters, DataValueDescriptor[] template, ColumnOrdering[] columnOrdering, SortObserver sortObserver, boolean alreadyInOrder, long estimatedRows, int estimatedRowSize) | Create a sort. |
java.util.Properties | defaultProperties() | There are no default properties for the external sort. |
protected MergeSort | getMergeSort() | Returns merge sort implementation. |
private static ModuleFactory | getMonitor() | Privileged Monitor lookup. |
double | getSortCost(DataValueDescriptor[] template, ColumnOrdering[] columnOrdering, boolean alreadyInOrder, long estimatedInputRows, long estimatedExportRows, int estimatedRowSize) | Estimate the cost of a sort. |
SortCostController | openSortCostController() | Return an open SortCostController. |
UUID | primaryFormat() | Return the primary format that this access method supports. |
java.lang.String | primaryImplementationType() | Return the primary implementation type for this access method. |
void | stop() | Stop the module. |
boolean | supportsFormat(UUID formatid) | Return whether this access method supports the format supplied in the argument. |
boolean | supportsImplementation(java.lang.String implementationId) | Return whether this access method implements the implementation type given in the argument string. |
private boolean userSpecified
private int defaultSortBufferMax
private int sortBufferMax
private static final java.lang.String IMPLEMENTATIONID
private static final java.lang.String FORMATUUIDSTRING
private UUID formatUUID
private static final int DEFAULT_SORTBUFFERMAX
private static final int MINIMUM_SORTBUFFERMAX
protected static final int DEFAULT_MEM_USE
protected static final int DEFAULT_MAX_MERGE_RUN
private static final int SORT_ROW_OVERHEAD
public java.util.Properties defaultProperties()
There are no default properties for the external sort.
Specified by: defaultProperties in interface MethodFactory
See Also: MethodFactory.defaultProperties()
public boolean supportsImplementation(java.lang.String implementationId)
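Description copied from interface: MethodFactory
Return whether this access method implements the implementation type given in the argument string.
Specified by: supportsImplementation in interface MethodFactory
See Also: MethodFactory.supportsImplementation(java.lang.String)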
public java.lang.String primaryImplementationType()
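Description copied from interface: MethodFactory
Return the primary implementation type for this access method.
Specified by: primaryImplementationType in interface MethodFactory
See Also: MethodFactory.primaryImplementationType()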
public boolean supportsFormat(UUID formatid)
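Description copied from interface: MethodFactory
Return whether this access method supports the format supplied in the argument.
Specified by: supportsFormat in interface MethodFactory
See Also: MethodFactory.supportsFormat(org.apache.derby.catalog.UUID)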
public UUID primaryFormat()
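Description copied from interface: MethodFactory
Return the primary format that this access method supports.
Specified by: primaryFormat in interface MethodFactory
See Also: MethodFactory.primaryFormat()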
protected MergeSort getMergeSort()
public Sort createSort(TransactionController tran, int segment, java.util.Properties implParameters, DataValueDescriptor[] template, ColumnOrdering[] columnOrdering, SortObserver sortObserver, boolean alreadyInOrder, long estimatedRows, int estimatedRowSize) throws StandardException
Create a sort.
Specified by: createSort in interface SortFactory
Throws: StandardException - if the sort could not be opened for some reason, or if an error occurred in one of the lower level modules.
See Also: SortFactory.createSort(org.apache.derby.iapi.store.access.TransactionController, int, java.util.Properties, org.apache.derby.iapi.types.DataValueDescriptor[], org.apache.derby.iapi.store.access.ColumnOrdering[], org.apache.derby.iapi.store.access.SortObserver, boolean, long, int)
public SortCostController openSortCostController() throws StandardException
Return an open SortCostController which can be used to ask about the estimated costs of SortController() operations.
Specified by: openSortCostController in interface SortFactory
Throws: StandardException - Standard exception policy.
See Also: SortCostController
public void close()
Description copied from interface: SortCostController
Close the open controller. This method always succeeds and never throws an exception. Callers must not use the SortCostController after closing it; they are strongly advised to clear their reference to it after closing.
Specified by: close in interface SortCostController
The sort algorithm is an N * log(N) algorithm. The following numbers were measured on a PII, 400 MHz machine, jdk117 with JIT, insane.zip. This test is a simple "select * from table order by first_int_column". I then subtracted the time it takes to do "select * from table" from the result.
Number of rows | Elapsed time in seconds |
---|---|
1000 | 0.20 |
10000 | 10.5 |
100000 | 80.0 |
We assume that the formula for sort performance is of the form: performance = K * N * log(N). Solving the equation for the 1000 and 100000 cases we come up with: performance = 1 + 0.08 N ln(N)
NOTE: Apparently, these measurements were done on a faster machine than was used for other performance measurements used by the optimizer. Experiments show that the 0.08 multiplier is off by a factor of 4 with respect to other measurements (such as the time it takes to scan a conglomerate). I am correcting the formula to use 0.32 rather than 0.08. - Jeff
RESOLVE (mikem) - this formula is very crude at the moment and will be refined later. Known problems:
1) internal vs. external sort - we know that the performance of sort is discontinuous when we go from an internal to an external sort. A better model is probably a different set of constants for internal vs. external sort and some way to guess when this is going to happen.
2) current row size is never considered but is critical to performance.
3) estimatedExportRows is not used. This is a critical number to know if an internal vs. an external sort will happen.
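As a rough illustration of the corrected formula, the sketch below evaluates 1 + 0.32 * N * ln(N) for a few row counts. It is not Derby's source: the class and method names are hypothetical, and the guard for zero rows is an assumption added here because ln(0) is undefined.

```java
// Sketch only: illustrates the formula from the description, not Derby's implementation.
final class SortCostSketch {

    // Constants taken from the description above: 1 + 0.32 * N * ln(N).
    static final double BASE_COST = 1.0;
    static final double MULTIPLIER = 0.32;

    /** Returns the estimated sort cost for the given number of input rows. */
    static double estimateSortCost(long estimatedInputRows) {
        if (estimatedInputRows < 0) {
            throw new IllegalArgumentException("estimatedInputRows must be >= 0");
        }
        // Assumption: treat an empty sort as free, since ln(0) is undefined.
        if (estimatedInputRows == 0) {
            return 0.0;
        }
        return BASE_COST + MULTIPLIER * estimatedInputRows * Math.log(estimatedInputRows);
    }

    public static void main(String[] args) {
        // Print the estimate for the row counts used in the measurements.
        for (long n : new long[] {1_000L, 10_000L, 100_000L}) {
            System.out.printf("%d rows -> %.1f%n", n, estimateSortCost(n));
        }
    }
}
```

Note that this sketch takes only the input row count, which matches the known problems listed above: row size and estimatedExportRows play no part in the estimate.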
Specified by: getSortCost in interface SortCostController
Parameters:
template - A row which is prototypical for the sort. All rows inserted into the sort controller must have exactly the same number of columns as the template row. Every column in an inserted row must have the same type as the corresponding column in the template.
columnOrdering - An array which specifies which columns participate in ordering - see interface ColumnOrdering for details. The column referenced in the 0th columnOrdering object is compared first, then the 1st, etc.
alreadyInOrder - Indicates that the rows inserted into the sort controller will already be in order. This is used to perform aggregation only.
estimatedInputRows - The number of rows that the caller estimates will be inserted into the sort. This number must be >= 0.
estimatedExportRows - The number of rows that the caller estimates will be exported by the sorter. For instance, if the sort is doing duplicate elimination and all rows are expected to be duplicates, then estimatedExportRows would be 1. If no duplicate elimination is to be done, then estimatedExportRows would be the same as estimatedInputRows. This number must be >= 0.
estimatedRowSize - The estimated average row size of the rows being sorted. This is the client portion of the row size; it should not attempt to calculate Store's overhead. -1 indicates that the caller has no idea (and the sorter will use 100 bytes in that case). Used by the sort to make good choices about in-memory vs. external sorting, and to size merge runs. The client is not expected to estimate the per column/per row overhead of raw store, just to make a guess about the storage associated with each row (i.e. reasonable estimates for some implementations would be 4 for int, 8 for long, 102 for char(100), 202 for varchar(200), a number out of a hat for user types, ...).
Throws: StandardException - Standard exception policy.
public boolean canSupport(java.util.Properties startParams)
Description copied from interface: ModuleSupportable
See if this implementation can support any attributes that are listed in properties. The module can check for attributes in the properties to see if it can fulfill the required behaviour. E.g. the raw store may define an attribute called RawStore.Recoverable. If a temporary raw store is required, the property RawStore.recoverable=false would be added to the properties before calling bootServiceModule. If a raw store cannot support this attribute, its canSupport method would return false. Also see the Monitor class's prologue to see how the identifier is used in looking up properties.
A better way may be to have properties of the form RawStore.Attributes.mandatory=recoverable,smallfootprint and RawStore.Attributes.requested=oltp,fast
Specified by: canSupport in interface ModuleSupportable
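The attribute check described for canSupport could look roughly like the sketch below. It reuses the RawStore.recoverable example from the description; the class, the `supportsTemporary` flag, and the choice to accept a missing attribute are assumptions made for illustration, not Derby's behaviour.

```java
import java.util.Properties;

// Sketch only: a hypothetical module check, not Derby's canSupport implementation.
final class CanSupportSketch {

    // Attribute name taken from the example in the description above.
    static final String RECOVERABLE_KEY = "RawStore.recoverable";

    /**
     * Returns true if a hypothetical raw store can satisfy the start parameters.
     * supportsTemporary says whether this store can run as a non-recoverable
     * (temporary) store.
     */
    static boolean canSupport(Properties startParams, boolean supportsTemporary) {
        String recoverable = startParams.getProperty(RECOVERABLE_KEY);
        if (recoverable == null) {
            // Assumption: no attribute requested means nothing to reject.
            return true;
        }
        boolean wantsRecoverable = Boolean.parseBoolean(recoverable);
        // A request for recoverable=false needs a store that supports temporary mode.
        return wantsRecoverable || supportsTemporary;
    }

    public static void main(String[] args) {
        Properties startParams = new Properties();
        startParams.setProperty(RECOVERABLE_KEY, "false");
        System.out.println(canSupport(startParams, false));
        System.out.println(canSupport(startParams, true));
    }
}
```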
public void boot(boolean create, java.util.Properties startParams) throws StandardException
Description copied from interface: ModuleControl
Boot this module with the given properties. An implementation's boot method can throw StandardException. If it is thrown, the module is not registered by the monitor and therefore cannot be found through a findModule(). In this case the module's stop() method is not called, so code that throws this exception must first free any resources it holds.
When create is true, the contents of the properties object will be written to the service.properties of the persistent service. Thus any code that requires an entry in service.properties must explicitly place the value in this properties set using the put method.
Typically the properties object contains one or more default properties sets, which are not written out to service.properties. These default sets are how callers modify the create process. When a database is created through a JDBC connection, the first set of defaults is a properties object that contains the attributes that were set on the jdbc:derby: URL. This attributes properties set has the second default properties set as its default. This second set (which could be null) contains the properties that the user set on their DriverManager.getConnection() call; it is not owned by Derby code and thus must not be modified by Derby code.
When create is false, the properties object contains all the properties set in the service.properties file plus a limited number of attributes from the JDBC URL attributes or connection properties set. This avoids properties set by the user compromising the boot process. An example of a property passed in from the JDBC world is the bootPassword for encrypted databases.
Code should not hold onto the passed-in properties reference after boot time, as its contents may change underneath it. At the very least, once the boot has completed, the links to all the default sets will be removed.
Specified by: boot in interface ModuleControl
Throws: StandardException - Module cannot be started.
See Also: Monitor, ModuleFactory
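The chained default sets described for boot rely on standard java.util.Properties behaviour: lookups through getProperty fall back to the defaults, while put only touches the top-level set. The sketch below shows that layering; the class name and the key "example.sortBufferMax" are hypothetical, and the values are illustrative only.

```java
import java.util.Properties;

// Sketch only: shows java.util.Properties default chaining, not Derby's boot code.
final class BootPropertiesSketch {

    /** Builds a three-level properties chain like the one described for create. */
    static Properties buildStartParams() {
        // Second default set: properties the user passed on the connection call.
        Properties userProps = new Properties();
        userProps.setProperty("user", "app");

        // First default set: attributes from the JDBC URL, defaulting to userProps.
        Properties urlAttributes = new Properties(userProps);
        urlAttributes.setProperty("create", "true");

        // Top-level set: only entries placed here with put belong to this set.
        Properties startParams = new Properties(urlAttributes);
        startParams.put("example.sortBufferMax", "1024"); // hypothetical key
        return startParams;
    }

    public static void main(String[] args) {
        Properties startParams = buildStartParams();
        // getProperty walks the defaults chain.
        System.out.println(startParams.getProperty("create"));
        // containsKey and size only see the top-level entries.
        System.out.println(startParams.containsKey("create"));
        System.out.println(startParams.size());
    }
}
```

Because the defaults are reachable only through getProperty, a value that must be persisted has to be put into the top-level set explicitly, which is the point the description makes about service.properties.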
public void stop()
Description copied from interface: ModuleControl
Stop the module.
Specified by: stop in interface ModuleControl
See Also: Monitor, ModuleFactory
private static ModuleFactory getMonitor()
Apache Derby V10.14 Internals - Copyright © 2004,2018 The Apache Software Foundation. All Rights Reserved.