ML12088A236: Difference between revisions
| page count = 202
}}
=Text=
{{#Wiki_filter:ENT000017 Submitted: March 28, 2012

ACKNOWLEDGMENTS

This technical document was prepared under the direction of Donna Kostanich, Assistant Division Chief for Sampling and Estimation, Decennial Statistical Studies Division. The overall management and coordination of the review was conducted by Dawn Haines and Douglas Olson.

The combined efforts of numerous U.S. Census Bureau staff have culminated in the publication of this document. Some staff members wrote chapters, while others reviewed chapters. In some cases, staff members filled both capacities.

Contributing to the March 2001 portion of Accuracy and Coverage Evaluation of Census 2000: Design and Methodology were Patrick Cantwell, Inez Chen, Danny Childers, Peter Davis, James Farber, Deborah Fenstermaker, Richard Griffin, Dawn Haines, Howard Hogan, Michael Ikeda, Donna Kostanich, Vincent Thomas Mule, Mary Mulry, Alfredo Navarro, Douglas Olson, J. Gregory Robinson, Robert Sands, and Michael Starsinic. Joseph Waksberg, of Westat, Inc., reviewed these chapters for readability and consistency.

Contributing to the A.C.E. Revision II section of Accuracy and Coverage Evaluation of Census 2000: Design and Methodology were Tamara Adams, Michael Beaghen, William Bell, Patrick Cantwell, Deborah Fenstermaker, Richard Griffin, Dawn Haines, Michael Ikeda, Donna Kostanich, Elizabeth Krejsa, Vincent Thomas Mule, Mary Mulry, Rita Petroni, Robert Sands, Eric Schindler, Bruce Spencer, of Northwestern University, and David Whitford.

Rhonda Geddings provided administrative support.

Bernadette Beasley, Meshel Butler, Helen Curtis, Susan Kelly, and Kim Ottenstein of the Administrative and Customer Services Division, Walter Odom, Chief, provided publications and printing management, graphics design, and composition and editorial review for print and electronic media. General direction and production management were provided by James Clark, Assistant Division Chief.

Margaret Smith of ACSD provided assistance in placing the electronic version of this document on the Internet (see www.census.gov/dmd/www/refroom.html).

We are grateful for the assistance of the individuals listed and all others who contributed but are not specifically mentioned. The preparation and publication of this document was possible because of their invaluable contributions.
Vacant, Principal Associate Director and Chief Financial Officer
Vacant, Principal Associate Director for Programs
Preston Jay Waite, Associate Director for Decennial Census
Nancy M. Gordon, Associate Director for Demographic Programs
Cynthia Z. F. Clark, Associate Director for Methodology and Standards
Marvin D. Raines, Associate Director for Field Operations
Arnold A. Jackson, Assistant Director for Decennial Census

SUGGESTED CITATION
FILES: Census 2000, Accuracy and Coverage Evaluation of Census 2000: Design and Methodology, U.S. Census Bureau, 2004

ECONOMICS AND STATISTICS ADMINISTRATION
Kathleen B. Cooper, Under Secretary for Economic Affairs

U.S. CENSUS BUREAU
Charles Louis Kincannon, Director
Hermann Habermann, Deputy Director and Chief Operating Officer

Foreword

The U.S. Census Bureau conducted the Accuracy and Coverage Evaluation (A.C.E.) survey to measure the coverage of the population in Census 2000. The A.C.E. was designed to serve two purposes: (1) to measure the net coverage of the population, both in total and for major subgroups, and (2) to provide data that could serve as the basis for correcting the census counts for such uses as Congressional redistricting, state and local redistricting, funds allocation and governmental program administration. The A.C.E. survey provides critical information that can be used to improve the census-taking process.

However, the design, methodology, operations and data collection efforts are extremely complex and not widely understood. The work described in this publication was a major undertaking, and the technical documentation is intended to increase awareness and knowledge, and subsequently improve the 2010 Census and coverage measurement techniques.

Despite the fact that coverage measurement techniques have been utilized by the Census Bureau for several decades, this is the first comprehensive documentation of its kind. This technical document describes the methodologies that were used to produce estimates of Census 2000 coverage error from the A.C.E. The first part of this document discusses the entire survey design used to produce the original estimates of net undercount released in March 2001. Analysis and evaluations indicated that there were serious errors in the March 2001 A.C.E. Research efforts to fix the detected errors resulted in improved coverage estimates referred to as A.C.E. Revision II. The second part of this document describes the methodology used to correct for errors in the March 2001 A.C.E.

After extensive analysis and consideration, the Census Bureau ultimately decided not to use the A.C.E. (neither the March 2001 nor the Revision II results) to correct the Census 2000 counts or any other data products. A.C.E. Revision II, the superior of the two results, provides useful coverage measurement information that can be used for research purposes. All of these results, decisions, supporting analyses, technical assessments, and limitations can be found on the Census Bureau's Web site at www.census.gov/dmd/www/EscapRep.html.

This document is intended to promote knowledge and encourage collaboration on coverage measurement issues. As such, we welcome comments and suggestions from colleagues on technical issues and also on the value of this document.

Charles Louis Kincannon
Director, U.S. Census Bureau

CONTENTS

Section I: A.C.E. March 2001

Chapters
1. Introduction to the A.C.E.
2. Accuracy and Coverage Evaluation Overview
3. Design of the A.C.E. Sample
4. A.C.E. Field and Processing Activities
5. Targeted Extended Search
6. Missing Data Procedures
7. Dual System Estimation
8. Model-Based Estimation for Small Areas

Appendixes
A. Census 2000 Missing Data
B. Demographic Analysis
C. Weight Trimming
D. Error Profile for A.C.E. Estimates

Section I References

Section II: A.C.E. Revision II March 2003

Chapters
1. Introduction to A.C.E. Revision II
2. Summary of A.C.E. Revision II Methodology
3. Correcting Data for Measurement Error
4. A.C.E. Revision II Missing Data Methods
5. Further Study of Person Duplication in Census 2000
6. A.C.E. Revision II Estimation
7. Assessing the Estimates

Section II References

Accuracy and Coverage Evaluation of Census 2000: Design and Methodology
Section I
A.C.E. March 2001
U.S. Census Bureau, Census 2000

Chapter 1. Introduction to the A.C.E.
INTRODUCTION

The U.S. Census Bureau conducted the Accuracy and Coverage Evaluation (A.C.E.) to measure the coverage of the population in Census 2000 and to allow for the possibility of correcting the census results for the measured undercount. It also provides a wealth of information on the census process and may, thus, enable improvement in future censuses. This document is written to provide a clear and permanent record of the methods and operations used in this project.

The current chapter presents the objectives and scope of the A.C.E., and discusses limitations of what it was attempting to accomplish. It includes a brief history of the evolution of the statistical and operational methods upon which the A.C.E. is based. Chapter 2 presents an overview of the various statistical steps necessary to produce estimates of census coverage and how they are tied into the operation of the survey. The sequence of major activities and their timing is given. Subsequent chapters discuss in detail A.C.E. sampling, interviewing, processing, and estimation steps.
Goals

The evaluation of the completeness of census enumeration has been an integral part of the decennial census since the 1950 census. This evaluation has taken on many forms including demographic analysis, administrative record checks, matches to independent surveys, and dependent record rechecks and reinterviews.

The evaluation of the five censuses from 1950 to 1990 clearly showed that each of the traditional decennial censuses undercounted the total population, and further, missed certain identifiable population groups at greater rates than others. Specifically, these evaluations clearly showed that undercounts were not merely random occurrences, but predictable biases in the census taking process. The undercount has been consistently higher for the African-American population than for the rest of the population, and while the data set is not so extensive, the evidence also pointed to consistently higher undercounts for Hispanics, Asians, Pacific Islanders, and American Indians than for the White non-Hispanic population. The undercount was also related to socioeconomic status, chiefly measured by home ownership, with renters having consistently higher undercounts. The U.S. Census Bureau designed the Accuracy and Coverage Evaluation to measure this differential undercount and, if possible, correct the counts, thereby making the census more accurate.

As mentioned earlier, the A.C.E. was designed to serve two purposes. One goal was to measure coverage of the population, both total and in various major subdivisions such as race/ethnicity, sex, major geographical areas, and socioeconomic groupings. These measurements indicate whether changes made in enumeration methods in the 2000 census were successful in improving the census and show where improvements may be necessary in future censuses. Another goal was to provide data that could serve as the basis for correcting the census counts.

In planning the A.C.E., the Census Bureau focused on the accuracy of population totals for both geographic areas and demographic groups. Consideration was given to the possibility of improving both the population totals (numeric accuracy) and population shares (distributive accuracy).

Although early planning considered using dual system estimation to produce a "one number census," after the Supreme Court ruled on the use of sampling for congressional apportionment in 1999, the survey was redesigned and refocused on non-apportionment uses. One important use was congressional redistricting. Thus an important consideration in the design was to improve the accuracy of population counts for congressional districts, which average around 650,000 people. The U.S. Census Bureau also recognized other uses, including state and local redistricting, funds allocation, and program administration. The traditional goals of coverage evaluation, to inform users and aid in the planning of the next census, continue to be important. These goals greatly influenced the sample and estimation design.

The A.C.E. Defined

The A.C.E. is a post-enumeration survey, based on the theory of dual system estimation. The results of the dual system estimation can be used with model-based estimation to produce census files adjusted for the measured net undercount (or net overcount). The design involved comparing (matching) the information from an independent sample survey to initial census records.

In this process, the Census Bureau conducted field interviewing and computerized and clerical matching of records. Using the results of this matching, the Census Bureau applied dual system estimation to develop estimates of coverage for various population groups. The initial plans were to apply correction factors to the census files that could be used to produce all required Census 2000 tabulations, other than apportionment. The correction aspect of Census 2000 tabulations was later abandoned.

The A.C.E. can be summarized as follows:
* Select a stratified random sample of blocks for the A.C.E.
* Create an independent list of housing units in the sample of A.C.E. blocks.
* Begin conducting telephone interviews of housing units that mailed in a completed questionnaire and that could be clearly linked to a telephone number.
* After the initial census nonresponse follow-up, conduct a personal visit interview at every housing unit on the independent list not already interviewed by telephone.
* Match the results of the A.C.E. interview to the census and vice versa.
* Search the census records for duplicates.
* Resolve cases that require additional information for matching by conducting a personal visit follow-up interview.
* Use the information from other, similar people to impute missing information.
* Categorize the A.C.E. data by age, sex, tenure, race/ethnicity and other appropriate predefined variables into estimation groupings called post-strata.
* Calculate the coverage correction factors for each post-stratum using the dual system estimator.
* If appropriate, apply the coverage correction factors to correct the initial census data using a model-based estimator and tabulate the statistically corrected census results.

There are a number of assumptions inherent in the A.C.E. Proper application of the dual system estimation (DSE) model requires that the A.C.E. be conducted independently of the census and that the rules used to determine correct enumerations be the same as the rules used to determine cases eligible for matches. The DSE model can be sensitive to measurement errors. It is important to obtain consistent reporting of Census Day residence. Inclusion of fictitious persons and errors in matching can directly influence the DSE. There are other assumptions necessary in developing models for handling nonresponse and other missing information.

The A.C.E. design was based very much on the theoretical concepts discussed and publicly presented by the Census Bureau in advance of the census. These concepts included careful attention to statistical independence, a strict application of the concepts of sufficient information, and careful attention to balancing the concepts used to measure census misses, as well as census erroneous inclusions. For a more detailed discussion of this approach see Hogan (2000).

Design Limitations of the A.C.E.

The A.C.E. was designed to measure the household population for large social, economic, ethnic, racial and geographic groups and compare them with the census counts.
The results provide a measure of net undercount and a mechanism to correct that net undercount, if that appears advisable. Although the goal of the A.C.E. was to measure the net undercount, it also provides information on the separate components of the net undercount such as omissions and various types of erroneous enumerations in the census. Measures of gross error cannot be obtained directly and exclusively from these components because of the strict definition of "correct" that is needed to implement the dual system estimator. For example, A.C.E. treats census enumerations as not correctly enumerated if they lacked sufficient information for accurate matching. This requirement allows for more precise matching, but increases both the number of nonmatching cases and the number of cases coded as erroneous. A similar strict rule on correct block location of an address also increases both the non-matches and erroneous enumerations. These rules may be inapplicable in the census outside the DSE context.

The design of the A.C.E. does not provide information on very local or unique errors in the census process. Specifically, the A.C.E. was not designed to correct for particular errors made by, say, a census taker or a local census manager, or to correct for local errors in the census address list. The Census Bureau had other programs in place to deal with these issues, such as the quality assurance process, the coverage improvement follow-up, and the local update of census addresses. The A.C.E. was designed, rather, to correct for large systematic errors in census taking, most especially the historic differential undercount.

Finally, the A.C.E. was not designed to measure the undercount for some special population groups such as the group quarters population (including college dormitories, institutions, and military barracks), the population that uses homeless shelters and/or soup kitchens, or the remote areas of Alaska. The Census Bureau instituted specialized procedures for these groups in order to achieve the best count possible. Extending the A.C.E. methods to all of these populations would have been very costly and difficult to implement properly.
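
The estimation steps summarized under "The A.C.E. Defined" (post-strata, the dual system estimator, and coverage correction factors) can be sketched in a few lines. The following is a minimal illustration only. It assumes the commonly cited form of the dual system estimator for one post-stratum, in which the census records eligible for matching are multiplied by the E-sample correct enumeration rate and divided by the P-sample match rate. The function names and every input number are hypothetical and are not A.C.E. results, and the production estimator described in later chapters includes refinements that are not shown here.

```python
def dual_system_estimate(census_count, excluded_records,
                         e_sample_total, e_sample_correct,
                         p_sample_total, p_sample_matches):
    """Illustrative dual system estimate for a single post-stratum.

    Assumed form (a sketch, not quoted from this chapter):
        DSE = (census_count - excluded_records)
              * (e_sample_correct / e_sample_total)
              / (p_sample_matches / p_sample_total)
    """
    eligible = census_count - excluded_records        # census records eligible for matching
    correct_rate = e_sample_correct / e_sample_total  # E sample: share correctly enumerated
    match_rate = p_sample_matches / p_sample_total    # P sample: share found in the census
    return eligible * correct_rate / match_rate


def coverage_correction_factor(dse, census_count):
    """Ratio of the dual system estimate to the census count."""
    return dse / census_count


# Hypothetical weighted counts for one post-stratum (not A.C.E. data).
dse = dual_system_estimate(
    census_count=100_000, excluded_records=2_000,
    e_sample_total=5_000, e_sample_correct=4_750,
    p_sample_total=5_000, p_sample_matches=4_500,
)
ccf = coverage_correction_factor(dse, 100_000)
print(round(dse), round(ccf, 4))
```

In this sketch a factor above 1 corresponds to a measured net undercount for the post-stratum, and a factor below 1 to a net overcount.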
HISTORY

Starting with 1950, every census has included a formal study of the coverage of the population. The 2000 Accuracy and Coverage Evaluation (A.C.E.) is very much a continuation of that tradition.

1950 through 1970

The U.S. Census Bureau conducted its first post-enumeration survey, or PES, as part of the 1950 census. The essential elements in a post-enumeration survey are a second attempt to enumerate a sample of households and, using case-by-case matching, to determine the number and characteristics of people not included in the first census enumeration. This first PES was not based on dual system estimation.

During the next two decades the Census Bureau experimented with alternative coverage measurement methods based on case-by-case matching including a Reverse Record Check, administrative record checks, and a match to the Current Population Survey. In addition, there were various alternative versions of PES designs.

Soon after the completion of the 1950 census, methods of aggregate demographic analysis for coverage analysis were developed at Princeton University by Ansley Coale and colleagues. See Coale (1955), Coale and Rives (1973), and Coale and Zelnick (1963) for details. Demographic analysis (DA) is the construction of an estimate of the true population using birth, death, migration and other data sources. This methodology can provide independent measures of the census net undercount by age, sex, and Black/non-Black; however, it is subject to its own limitations and uncertainties. An important limitation is the lack of data to independently estimate the Hispanic, Asian, and American Indian populations or other detailed demographic groups, such as homeowners or renters. Nor can demographic analysis provide estimates for geographic areas below the national level. In addition, the level of emigration and undocumented immigration must be estimated using indirect methods. Because the U.S. has had reasonably complete birth registration only since 1935, sophisticated analysis was needed in 1950 for the population over age 15. Early studies were restricted to the native-born White population, but with time were expanded to include the native-born African-American population as well.

Later work at the U.S. Census Bureau by Jacob Siegel and colleagues expanded the estimates to the total population, with the first official estimates being issued in conjunction with the 1970 census (Siegel, 1974). The 1970 estimates recognized the need to address the problem of race misclassification in the complete count. By the time of the 1970 census, the population covered by birth registration included those under age 35, with tests of birth registration completeness having been conducted in 1940, 1950, and the mid-1960s. Medicare data now provided a basis for estimates for those over age 65.

However, the difficulty of measuring migration, an important component of DA, gained attention. These studies noted, "The figures on net immigration for the 1960 to 1970 decade should be considered as estimates subject to considerable error." Importantly, the estimates did not include any allowance for "... unrecorded alien immigration, particularly illegal immigration." See Siegel (1974) for more details.

During these same decades, the methods of dual system estimation were being refined for use in the human population. Although introduced over a century ago for use in animal populations, dual system estimation was first used with human populations in an important article by Sekar and Deming (1949) that applied the technique to measuring births. Dual system estimation was widely used to measure births and deaths in developing countries during the 1970s in conjunction with important operational and theoretical work. The ideas from dual system estimation were soon applied to post-enumeration surveys. See most importantly Marks (1979).
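
The accounting behind demographic analysis, and the effect of leaving out a migration component, can be illustrated with a short sketch. It assumes the basic component form (births minus deaths plus immigration minus emigration) and the usual definition of percent net undercount as the shortfall of the census count relative to the independent estimate. The function names and all figures are hypothetical and are not actual DA components.

```python
def da_estimate(births, deaths, immigration, emigration):
    """Population implied by the demographic components (illustrative)."""
    return births - deaths + immigration - emigration


def percent_net_undercount(estimate, census_count):
    """Positive values indicate a net undercount, negative a net overcount."""
    return 100.0 * (estimate - census_count) / estimate


census_count = 1_010  # hypothetical census count, in thousands

# Components using recorded immigration only (hypothetical, in thousands).
recorded_only = da_estimate(births=1_200, deaths=300,
                            immigration=150, emigration=50)

# Same components plus a hypothetical allowance of 30 for unrecorded immigration.
with_allowance = da_estimate(births=1_200, deaths=300,
                             immigration=150 + 30, emigration=50)

rate_recorded_only = percent_net_undercount(recorded_only, census_count)
rate_with_allowance = percent_net_undercount(with_allowance, census_count)
print(rate_recorded_only, round(rate_with_allowance, 2))
```

With these made-up numbers, omitting the allowance makes the census appear to overcount, while including it yields a positive net undercount, which is why the treatment of unrecorded immigration matters so much for DA.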
1980

The design of the A.C.E. traces most directly to the 1980 Post-Enumeration Program (PEP). This was the first large scale post-enumeration survey to use dual system estimation. In addition, it included several important innovations, as well as important lessons on the design of a PES.

The 1980 PEP was based on a match of people included in the April and the August Current Population Survey to the 1980 census. This match was used to determine the proportion of people counted in the census. It was a sample of people known to exist and be residents of the U.S., and was labeled the Population or P sample.

All matching was done by clerks and technicians. In order to make it possible to do the matching, each person's address needed to be assigned the correct census geographic code (geocoded). This process was slow and error prone.

In addition, a separate sample of census records was drawn. This was known as the Enumeration or E sample. The census records included in the E sample were checked in the office to see if they were duplicated, followed by a field operation to determine whether the people were real, lived at the address on Census Day, and whether the unit was assigned the correct census geographic code (correctly geocoded).

One important concept introduced in 1980 was that of sufficient information for matching. Sufficient information for matching means that a record, from either the P or E sample, contains sufficient information, including most importantly a name, to allow accurate matching and follow-up. Records that lack this information are removed from matching, processing and estimation. For the E sample, this exclusion is done in two parts: census imputed records (non-data-defined) are excluded from the sampling frame, and then sampled data-defined records are reviewed for name and other necessary information.

Another concept used earlier but made explicit in 1980 was that of search area. A person was only considered correctly enumerated if he/she was counted in a specific, defined area that included the address where he/she should have been enumerated. This search area was to be applied to both the P and the E samples.

The 1980 PEP was also, very importantly, the first PES to be, itself, carefully evaluated (Fay et al., 1988). This evaluation proved invaluable to the design of the 1990 PES. Among the important findings were:
* Sampling variances were very high.
* Geocoding a sample of housing units was costly and error prone.
* Drawing independent P and E samples made it very hard to apply the same concepts, especially that of search area.
* Levels of missing data needed to be reduced and methods to account for the missing data needed to be refined.
* Matching needed to be made more accurate and faster.
* An independent sample of people living in institutions proved nearly impossible to match and process, both because the interviews relied on the same set of administrative records and because administrators often refused to give names, even to the Census Bureau.

By 1980, the precision of demographic analysis benefited from the fact that the part of the population not covered by either adequate birth registration data or Medicare data was now reduced to only those 45 to 65 (in 1980). However, immigration, especially illegal/undocumented/unauthorized immigration, remained a problem. Early demographic estimates for 1980, which again did not contain an allowance for illegal immigration, showed a net overcount of the population. However, pathbreaking work by Jeff Passel and colleagues produced the first estimates of the number of illegal immigrants counted in the census. This work was generally validated when data from the Immigration Reform and Control Act (IRCA) produced similar numbers of immigrants applying for legalization.

Although the 1980 PEP was not explicitly designed to correct the census for measured undercount, it was the first PES to be considered in this context. Increased use of census results for congressional, state, and local redistricting, as well as for federal funds allocation, highlighted the importance of census accuracy. The voting rights cases of the 1960s (Baker v. Carr (1962), Reynolds v. Sims (1964)) had greatly increased the importance of census data in redistricting. General Revenue Sharing funds, distributed in part based on census data, became an important source of local government revenue in the mid 1970s. The legal and statistical questions were discussed in academic journals and as part of several lawsuits, including influential suits by the City of Detroit and the City of New York. The U.S. Census Bureau's position was that the 1980 PEP was not of sufficient accuracy for this purpose, and this decision was upheld.
1990

Building on the knowledge gained in 1980, the Census Bureau made major design changes for the 1990 PES. Important changes included:
* Excluding institutional population and military ships/barracks from the universe.
* The use of a block sample tied to census geographic codes, with the same sample of blocks used for both the P and the E sample.
* Repeated call-back to reduce nonresponse and missing data.
* A computer and computer-assisted clerical matching operation.
* A model to account for missing data taking into account the important covariates.

The design of the estimation cells (post-strata) was completely changed. Following the advice of John Tukey and others, the estimation cells were not restricted to a single state, but allowed to cross state lines. Thus, Hispanics living in Utah could be combined with Hispanics living in Colorado and other mountain states to form one estimation cell, rather than being combined with non-Hispanics living in Utah. A smoothing model was used to combine information within Census Region.

The 1990 PES was explicitly designed so that it could be used to adjust the census results. Specifically, model-based methods were developed to carry the estimates down to the smallest census geographic units (blocks) and to include positive or negative whole person records to account for the measured net undercount or overcount. This complete file could then be aggregated to obtain data that was consistent for all geographical levels.

Many lessons were learned in 1990, many having to do with the need for tight operational control and testing. One important statistical lesson concerned the use of the statistical smoothing methods. These methods became highly controversial and became the focus of much statistical analysis and debate. They were not well understood, and the U.S. Census Bureau decided to drop the use of smoothing and instead recompute the results with fewer and thus larger estimation cells.

Demographic analysis estimates went very smoothly in 1990 with birth registration and Medicare data covering all but those age 55 to 65. The IRCA data and the work of Jeff Passel and others (see Fay et al., 1988, Chapters 2 and 3) provided an allowance for undocumented immigrants. Further, for the first time, the Census Bureau produced explicit allowances for the uncertainty in the demographic analysis estimates. This analysis showed that the "preferred" or "point" demographic analysis estimates tended to fall at the lower end of the uncertainty range. However, this method of expressing the uncertainty range came under criticism from outside the Census Bureau. Limitations of this method are documented in Robinson et al. (1993) and Himes and Clogg (1992).

The 1990 Demographic Analysis estimates were in general agreement with the results of the 1990 PES. At the national level the two estimates were very close: 1.8 percent undercount for demographic analysis (later revised to 1.7 percent) and 1.6 percent for the PES. At more detailed levels, differences emerged, especially the tendency for the PES to greatly underestimate the undercount for adult African-American males. Taking into account what was known about the biases and uncertainties of each, it seemed clear that both were measuring a real differential undercount even though the PES was underestimating the amount for adult African-American males.
2000

In the early 1990s, task forces and National Academy of Science Panels suggested that the differential undercount in the census could not be reduced without elaborate enumeration and matching procedures, which are too costly to be carried out except on a sample of the population. In the 1995 and 1996 Census Tests, an alternative Census Plus methodology was compared to the DSE. The performance of the DSE was better and subsequent research efforts focused on improving the DSE. Consequently, most of the A.C.E. design can be seen as a continuation and refinement of the 1990 PES design. Among the important refinements are:
* Much larger and better designed block sample.
* Earlier interviewing, including the use of early telephone interviewing.
* Computer-assisted (laptop) telephone and personal interviewing.
* More refined estimation cells (post-strata).
* Explicit collapsing rules to account for small cell size.
* Explicit weight trimming rules in case of extraordinary (outlier) cells.

The survey universe was restricted to the housing unit/household population. All group quarters populations, not just military and institutional, were excluded. Consequently, the A.C.E. estimate of coverage error will be underestimated to the extent there were errors in the group quarters population.

Another concern is the treatment in the DSE of cases involved in the Housing Unit Duplication Operation (referred to as "late census adds") and the level of whole person imputations in the census. These records were not included in the A.C.E. matching, processing, or follow-up processes. They were also excluded from the DSE, although properly accounted for in computing the net undercount. It is possible that, had these records been included in the A.C.E. and the DSE, the estimated undercount would have differed. The number of excluded records is much larger than it was in 1990. If the ratio of matches to correct enumerations is the same for the excluded and included cases, the DSE expected value should be nearly the same. However, if the people referred to in the correct cases were either much more likely to have been included in the A.C.E. or much less likely to have been included, then excluding these cases from the A.C.E. would have changed the level of correlation bias and affected the A.C.E. For more detail, see Hogan (2001).

There was a change in the treatment of people who had moved between April 1 and the time of the PES interview. In 1980 and 1990, these movers were sampled at their current (i.e., PES Interview Day) address. In the A.C.E., they were sampled at their Census Day, April 1, address.

Although conceptually much the same, the implementation of the search area was very different. In 1990, the entire search area was always to be searched for all cases in order to find matches or duplicates, and all cases were map-spotted to determine whether they were inside the search area. In 2000, the search of the surrounding blocks was restricted by both targeting and sampling. First, the surrounding block was searched for only certain kinds of cases, specifically cases where there was a likelihood of geocoding error in the basic census process. In addition, a stratified sub-sample was taken for this search, with only some of the initial sample blocks subjected to this extended search. This process was known as Targeted Extended Search, or TES.

Because of the difficulty in explaining and defending the 1990 smoothing methods, smoothing models were not employed. Instead, the A.C.E. relied upon a larger sample size and a more refined set of estimation cells to produce estimates.

Finally, although this was not a separate step, the A.C.E. was subjected to much more exacting specification, documentation and testing than any previous coverage measurement study. Much of the operational success of the A.C.E. can be traced to the care and attention given to documentation and testing.

This document is then very much part of the overall A.C.E. process. It attempts to document, concisely and clearly as well as precisely and accurately, the A.C.E. design.

Chapter 2. Accuracy and Coverage Evaluation Overview

INTRODUCTION

The Accuracy and Coverage Evaluation Survey (A.C.E.) was designed primarily to measure the net undercoverage or overcoverage in the census enumeration. The methodology used was dual system estimation, which requires two independent systems of measurement. The P sample or Population sample measured the housing unit population, as did the census, but was conducted independently of the census. This was done by selecting a sample of block clusters, geographically contiguous groups of blocks, and interviewing housing units that were obtained by independently canvassing each block cluster.

The results of the P sample were matched to census enumerations to determine the omission rate in the census. Additionally, a sample of census enumerations, the E sample, was selected to measure the erroneous enumeration rate in the census. The E sample was composed of census enumerations in the same sample block clusters as the P sample. These overlapping samples reduced variance on the dual system estimator, reduced the amount of field activities and their cost, and resulted in efficient data processing.

There were considerable challenges in the implementation of the A.C.E. One of the requirements of the A.C.E. was to produce measures of net undercount or overcount shortly after the census counts were compiled. This was a daunting task because the requirement for independence meant that A.C.E. activities could not interfere with, or in any way affect, the results of the census enumerations, or vice versa. As with most surveys, the A.C.E. consisted of designing a sample, creating a frame, selecting the sample, conducting the interviews, dealing with nonresponses and missing information, as well as producing the estimates. In addition, the A.C.E. had several matching and field follow-up activities. In order to accomplish these tasks and meet the goals of the A.C.E. in a timely manner, its design was uniquely built around census operations. Additionally, to ensure quality with such a compressed time schedule, it was essential that software systems be written and thoroughly tested prior to the start of an activity.

One census operation that had major influence on the A.C.E. design and estimation plan was the Housing Unit Duplication Operation. As the census questionnaires were being processed, the Census Bureau suspected that there was a significant number of duplicate addresses in the census files. To address the suspected housing unit duplication, the Housing Unit Duplication Operation was introduced in the fall of 2000. See Nash (2000) for further details. The primary goal of this census operation was to improve the quality of the census; however, its design allowed the A.C.E. operations to proceed. Essentially, suspected duplicate housing units were temporarily removed from the census files, while further analysis was done for these cases. Approximately 5.9 million person records were in these suspected duplicate housing units, which were: 1) out-of-scope for the E-sample component of the A.C.E., 2) not available for the person matching including the identification of person duplicates in the E sample, and 3) excluded from the census component in the dual system estimates. Approximately 2.3 million person records were reinstated into the census after the E sample was selected and were reflected in the net coverage estimates.
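
A small numeric sketch can show why removing records from both the census component and the matching need not move the dual system estimate. It assumes a simplified dual system form, N = (correct enumerations) x (P-sample total) / (matches), and hypothetical weighted counts. It illustrates only the proportionality condition under which exclusions leave the estimate unchanged, and it is not the A.C.E. estimator or the analysis in Hogan (2001).

```python
import math


def simple_dse(correct_enumerations, p_sample_total, matches):
    """Simplified dual system estimate (illustrative form only)."""
    return correct_enumerations * p_sample_total / matches


# Hypothetical weighted counts with no records excluded.
full = simple_dse(correct_enumerations=9_000.0,
                  p_sample_total=10_000.0, matches=8_100.0)

# Exclude 5 percent of correct enumerations and 5 percent of matches:
# the ratio of matches to correct enumerations is unchanged.
proportional = simple_dse(9_000.0 * 0.95, 10_000.0, 8_100.0 * 0.95)

# Exclude 5 percent of correct enumerations but only 2 percent of matches:
# the ratio changes, and so does the estimate.
disproportional = simple_dse(9_000.0 * 0.95, 10_000.0, 8_100.0 * 0.98)

print(math.isclose(full, proportional))  # estimate unchanged up to rounding
print(disproportional < full)            # estimate shifts when the ratio shifts
```

The design point is that only the ratio of matches to correct enumerations enters this simplified estimator, so exclusions that preserve the ratio cancel out, while exclusions that change it do not.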
Hogan (2001) showed that excluding these person records from the A.C.E. would not affect the dual system estimates if the number of P-sample matches was reduced proportionately to the number of E-sample correct enumerations.

This chapter summarizes the major activities of the A.C.E. and indicates their relationship to the census. Subsequent chapters go into considerably greater detail about the methodology of the A.C.E. and are organized as follows:

* Chapter 3. Design of the A.C.E. Sample
* Chapter 4. A.C.E. Field and Processing Activities
* Chapter 5. Targeted Extended Search
* Chapter 6. Missing Data Procedures
* Chapter 7. Dual System Estimation
* Chapter 8. Model-Based Estimation for Small Areas

The intent of this chapter is to provide a broad context for the design of the A.C.E. Here we give a sequential accounting of these activities. Table 2-1 gives the order in which the A.C.E. activities occurred and maps the activities to the chapter where each is discussed in further detail. This table shows the substantial integration of the sampling and operational activities. Figure 2-1 shows the flow of the major activities.

Section I, Chapter 2: Accuracy and Coverage Evaluation Overview. U.S. Census Bureau, Census 2000.

Table 2-1. Sequence of A.C.E. Activities

Activity  Description                                     Chapter(s)
1         First-phase sampling                            3
2         Independent listing                             4
3         Second-phase sampling                           3
4         Initial housing unit matching/field follow-up   4
5         Targeted extended search                        4 & 5
6         Subsampling within large block clusters         3
7         A.C.E. person interviewing                      4
8         E-sample identification                         3 & 4
9         Person matching and field follow-up             4
10        Missing data processing                         6
11        Dual system estimation                          7
12        Model-based estimation for small areas          8

Table 2-2 further illustrates the integration of the sampling activities and operations by summarizing the sample size at each phase of sampling and the operations for which the sample is an input. The data collected from each operation are input to the next sampling operation. For example, the first phase of sampling resulted in 29,136 sample areas with almost 2 million housing units.
Independent address lists were created for these areas. The results of the independent listing were used in the second phase of sampling.

Activity 1: First-Phase Sampling

Timing: March through June, 1999; prior to the creation of the census address list.

At the time of the January, 1999 Supreme Court ruling against the use of sampling for apportionment, the Census Bureau was heavily involved in the first phases of sampling for the Integrated Coverage Measurement (ICM).
The goal of the ICM was to produce reliable estimates of coverage of each state's total population, and this required a very large sample; a 750,000 housing unit sample was planned. As a result of the Supreme Court ruling, state population estimates for apportionment were no longer key estimates of the coverage survey; instead, the goal was to measure census coverage for national and subnational population domains having different census coverage properties. These estimates could be measured with sufficient precision with a sample of about 300,000 housing units.

Rather than abandoning the effort (software development, etc.) that had already been invested in the ICM, it was more efficient, particularly from a software quality perspective, to complete the sampling for the ICM and then select a subsample for the A.C.E. The infrastructure for the field staff was being deployed in preparation for the first field operation that started in September, 1999, and the development of the sampling system that was scheduled to begin production in March, 1999 was well underway. There was not adequate time to redesign the A.C.E. sample allocation entirely, select the sample, produce the different listing materials including maps, conduct the listing as scheduled, and ensure a high level of quality in a revised software system. Consequently, the A.C.E. sample design was derived from the ICM design using a double sampling approach. The entire ICM sample was selected as originally planned and then reduced through various steps to yield the A.C.E. target housing unit sample.

The first-phase sampling consisted of:

* Forming primary sampling units.
* Stratifying primary sampling units.
* Systematic sampling of primary sampling units.

The A.C.E. primary sampling unit was the block cluster, a group of one or more geographically contiguous census blocks. To make efficient field workloads, the target size of block clusters was about 30 housing units, although block clusters varied in size. Within each state, block clusters were stratified by size using housing unit counts from a preliminary census address list: small (0 to 2 housing units), medium (3 to 79 housing units), and large (80 or more housing units). Some states included a separate sampling stratum for American Indian Reservations.
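The size stratification just described can be sketched in a few lines of Python. This is only an illustration, not the Census Bureau's production software: the function name, the string labels, and the on_reservation flag are hypothetical, and the rule that reservation clusters need three or more housing units is taken from the Chapter 3 description of the sampling strata.

```python
def first_phase_stratum(housing_units: int, on_reservation: bool = False) -> str:
    """Assign a block cluster to a first-phase sampling stratum.

    Hypothetical helper for illustration only. Thresholds follow the text:
    small (0 to 2), medium (3 to 79), large (80 or more housing units).
    """
    if housing_units < 0:
        raise ValueError("housing unit count cannot be negative")
    if housing_units <= 2:
        return "small"
    if on_reservation:
        # Assumption: only relevant in states that used a separate
        # American Indian Reservation stratum (3 or more housing units).
        return "american_indian_reservation"
    if housing_units <= 79:
        return "medium"
    return "large"


if __name__ == "__main__":
    # Preliminary housing unit counts here are made-up example values.
    for count in (0, 2, 3, 79, 80):
        print(count, first_phase_stratum(count))
```

The boundary values 2, 3, 79, and 80 are the cases worth checking, since the strata are defined by inclusive ranges.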
Within each sampling stratum, a systematic sample of block clusters was selected with equal probability. This phase of sampling yielded 29,136 block clusters with an estimated 2 million housing units in the 50 states and the District of Columbia.

Table 2-2. Sample Sizes by Sampling Phase and Operation

Sampling phase                               Sample areas  Sample housing units  Operations
First-phase                                  29,136        1,989,000             Independent listing
Second-phase                                 11,303        844,000               Initial housing unit matching/follow-up
Subsampling within large cluster (P-sample)  11,303        301,000               A.C.E. person interviewing, person matching/follow-up, dual system estimation
E-sample identification                      11,303        311,000               Person matching/follow-up, dual system estimation

Activity 2: Independent Listing

Timing: September through early December, 1999; well before census enumeration began.

Field staff visited the sample block clusters and created an independent address list of all housing units, including housing units at special places. The goal of this operation was to create an independent address frame of all the housing units that were likely to exist on Census Day, April 1, 2000. Since this operation occurred prior to Census Day, any potential housing unit structures were included on the independent address list. Later, during housing unit follow-up, these structures were visited to confirm that they actually contained housing units on Census Day. Since housing units could not be added to the independent address frame in this later operation, but could be removed, it was important to include structures with questionable housing unit status during the independent listing.

This listing consisted of approximately 2 million housing units or potential housing units in the 50 states and the District of Columbia.

Activity 3: Second-Phase Sampling

Timing: December, 1999 through February, 2000; prior to mailing the census questionnaire.

The second phase of sampling selected block clusters from the first phase to be the final A.C.E. sample areas.
Block clusters were stratified using two housing unit counts: 1) a count from the independent listing operation, and 2) a count from the updated census address list as of January, 2000. It was important to reduce the first-phase sample before the next operations, the housing unit matching and field follow-up, to reduce the number of clusters going into those operations. The stratification of the block clusters was done separately by first-phase sampling strata: 1) medium and large strata, and 2) small strata. All first-phase clusters from the American Indian Reservation stratum were retained in the second-phase sample.

Medium and large strata. The resulting national sample allocation was roughly proportional to state population with some differential sampling within states. The two goals of the differential sampling were: 1) to provide sufficient sample to support reliable estimates for several sub-populations, and 2) to reduce the variance contribution due to clusters with the potential for high omission or erroneous enumeration rates. These clusters were identified and put into separate sampling strata by comparing the consistency of housing unit counts between the independent list and the updated census list for each cluster.

Small cluster stratum. Conducting interviews and follow-up operations in small block clusters is much more costly per housing unit than in medium or large block clusters. Lower sampling rates were, therefore, used in this stratum. However, two considerations were taken into account in establishing the lower rates. One goal was to avoid having small clusters with an overall probability of selection much lower than the probability of selection of other clusters in the sample. A second goal was to have higher probabilities of selection for small clusters in which the number of housing units was greater than the expected 0 to 2 housing units. These two goals attempted to reduce the contribution of small clusters to the variance of the dual system estimates. Small block clusters with the potential for high erroneous enumeration or nonmatch rates were retained at higher rates. The second-phase sample contained 11,303 block clusters for the 50 states and the District of Columbia.

Activity 4: Initial Housing Unit Matching and Field Follow-Up

Timing: February through April, 2000; prior to census nonresponse follow-up.

The objectives of these operations were:

1. Create a list of confirmed A.C.E. housing units in order to:
* obtain the best list of housing units to facilitate person interviewing in later activities.
* have better control of the final A.C.E. housing unit sample size.
2. Establish a link between the A.C.E. and census housing units in order to:
* identify the A.C.E. housing units eligible for telephone interviewing.
* facilitate overlapping P and E samples.
3. Identify potential geocoding errors in order to:
* establish the targeted extended search sampling frame.
* identify sample areas for which the creation of a new independent address list, or relisting, was necessary.

Housing unit matching. The housing units on the census address list in January, 2000 were matched to the A.C.E. independent address list. First, the addresses were computer matched. The computer matching was followed by a clerical review of the computer match results in an automated environment intended to find additional matches using supplemental materials. There was also a clerical search, limited to the block cluster, for duplicate housing units during this phase of the matching. Possible duplicates in both the A.C.E. and the census were identified.

Housing unit follow-up. In some cases, the computer and clerical matching were not able to determine the status of a housing unit. Field staff visited these cases to get more information about these housing units. After matching, the cases which were not matched, possibly matched, or possible duplicates were sent to the field for follow-up interviews. Some of the matched cases were also sent for additional information. The field follow-up was designed to determine if a housing unit existed, if it existed in the block cluster, or if different addresses were referring to the same housing unit.

Activity 5: Targeted Extended Search

Timing: May, 2000.

The targeted extended search was designed to improve the accuracy of the dual system estimate by searching for matches, correct enumerations, and duplicates one ring beyond the sample block cluster. The operation was implemented in a subset of A.C.E. block clusters selected through a combination of certainty and probability sampling.

There are census geocoding errors of exclusion and inclusion in the A.C.E. sample block clusters. Census geocoding errors of exclusion (i.e., housing units miscoded in the census so they appear to be outside the A.C.E. block cluster) affect the P-sample match rate. Census geocoding errors of inclusion (i.e., housing units miscoded in the census to appear inside the block cluster) affect the erroneous enumeration rate in the census or E sample. If the census housing unit is omitted from the sample block cluster, the P-sample household cannot be matched. This yields a lower match rate. On the E-sample side, if a housing unit is included in the sample block cluster due to a geocoding error, the E-sample people will be considered erroneously enumerated.

The primary motivation for using an extended search area was to reduce the sampling variance of the dual system estimates due to census geocoding error. Even though the extended search allowed more P-sample people to be matched and more E-sample people to be converted to correct enumerations, the expected value of the dual system estimate should not be affected as long as the two samples were treated equally with respect to the search area. Another benefit is that the extended search makes the dual system estimate more robust by protecting against potential bias due to P-sample geocoding error.

Previous census evaluations have shown that geocoding errors are highly clustered. The targeted extended search was designed to take advantage of the distribution of geocoding errors by focusing on those clusters that contain the most potential geocoding errors. The implementation of this operation resulted in dual system estimates with more precision.

The initial housing unit matching results were used to identify the A.C.E. housing unit nonmatches and census housing unit geocoding errors. Clusters without A.C.E. housing unit nonmatches or census geocoding errors were out-of-scope for the targeted extended search sampling. Changes to the census inventory of housing units after January, 2000 were not reflected in the housing unit matching used to identify targeted extended search clusters.

Only whole households of nonmatched people were eligible for the extended search during person matching. Partial household nonmatches (i.e., some household members were matches) were not as likely to indicate that the housing unit was a geocoding error.

Activity 6: Subsampling Within Large Block Clusters

Timing: April and May, 2000; during census nonresponse follow-up.

Subsampling was used in large block clusters for the final selection of housing units to participate in the P sample.
The objective was to reduce costs and yield manageable field workloads without seriously affecting the precision of the A.C.E. by taking advantage of the high intra-class correlation expected in large block clusters. Since the large block clusters had a higher initial probability of selection than medium block clusters, the reduction in sample size had a fairly minor effect on the precision of the A.C.E. estimates. The subsampling of housing units within large clusters brought the overall probability of selection of these housing units more in line with housing units in the medium clusters.

Any block cluster with 80 or more confirmed A.C.E. housing units, based on the initial housing unit match, was eligible for this housing unit reduction. The reduction of housing units within a large block cluster was done by forming groups of adjacent housing units, called segments, and selecting one or more segments for A.C.E. person interviewing. The segments had roughly equal numbers of housing units within a block cluster. Segments of housing units were used as the sampling unit in order to obtain compact interviewing workloads and to facilitate the identification of an overlapping E sample. The A.C.E. housing units that were retained after all of the subsampling comprise the P sample.

After the reduction of housing units within large block clusters was completed, the A.C.E. interview sample size for the 50 states and the District of Columbia was approximately 300,000 housing units.

Activity 7: A.C.E. Person Interviewing

Timing: April through mid-June, 2000 for the telephone phase; mid-June through mid-September, 2000 for the personal visit phase; after census enumeration was complete.

The goal of the A.C.E. person interview was to provide a list of persons who lived at the sample address on Census Day, as well as those who lived at the address at the time of A.C.E. interviewing. The A.C.E. person interview was conducted using a Computer Assisted Personal Interview (CAPI) instrument.

To get an early start on interviewing, a telephone interview was conducted at households for which the census questionnaire was data-captured and included a telephone number. Households with mail returns and households with enumerator-filled questionnaires were both eligible for telephone interviews. Certain types of housing units, such as those without house number and street name, were not eligible for a telephone interview. All remaining interviews following the telephone operation were conducted in person. However, some nonresponse conversion operation interviews and interviews in gated communities or secured buildings were conducted by telephone.

The person interview was conducted only with a household member during the first 3 weeks of interviewing. If an interview with a household member was not obtained after 3 weeks, an interview with a nonhousehold member was attempted. This was called a proxy interview. Proxy interviews were allowed during the remainder of the interviewing period. During the last 2 weeks of interviewing, a nonresponse conversion operation was attempted for the noninterviews using interviewers who were considered to be the best available.

Activity 8: E-Sample Identification

Timing: October, 2000.
The E sample consisted of the census enumerations in the same sample areas as the P sample. All data-defined census person records in the A.C.E. block clusters were eligible to be in the E sample (see footnote 1). To be a census data-defined person, the person record must have two 100-percent data items filled. Name was not required for the person record to be considered data-defined, but could be one of the two items required to be data-defined.

As with the P sample, it was sometimes necessary to subsample the census housing units in a cluster when it contained a large number of census housing units. The goal of the E-sample identification was to create overlapping P and E samples in an effort to reduce person follow-up workloads. An overlapping P and E sample is not necessary, but improves both the cost effectiveness of the subsequent operation and the precision of the dual system estimates.

If a block cluster had fewer than 80 census housing units, then all of the census housing units in the block cluster were in the E sample. For block clusters with 80 or more census housing units, the within-cluster segments of adjacent housing units defined for the P-sample reduction were mapped onto the census records. This was possible when a link between the census and A.C.E. housing unit was established during the initial housing unit matching.
Using specific rules, census housing units that did not have this link were assigned to a segment. The segment selected for the P sample was selected for the E sample. If the sample segment contained 80 or more census housing units with no established link to an A.C.E. housing unit, then a systematic sample of these housing units was selected to reduce the E-sample person follow-up workloads.

This resulted in approximately 311,000 census housing units in the E sample for the 50 states and the District of Columbia.

Activity 9: Person Matching and Field Follow-Up

Timing: October and November, 2000.

Insufficient information for matching. Rules were established for determining which person records had sufficient information for matching. These rules were established and applied before the start of the matching operation to avoid introducing potential bias into the matching results. Both the P and E samples used the same rules. Each person record required a complete name and two other characteristics.

Person matching. All P-sample persons who lived at each sample housing unit on Census Day were matched to the people enumerated in the census to estimate the match rate. Census persons in the E sample who matched to the P sample were considered to be correctly enumerated. The E-sample person records that did not match to the P sample were interviewed during field follow-up operations to classify them as correctly or erroneously enumerated. This matching was a computer operation with clerical review. Variables such as name, address, date of birth, age, sex, race, Hispanic origin, and relationship were used to identify matches between the P sample and census enumerations. Duplicates were identified in both the P sample and E sample. If a case qualified for targeted extended search, the search for matches and duplicates was extended to the ring beyond the sample block cluster.

Person follow-up. The person follow-up interview collected additional information that was sometimes necessary for the accurate coding of the residence status of the nonmatched P-sample people and the enumeration status of the nonmatched E-sample people. The goal of this operation was to confirm that ambiguous P-sample nonmatches actually lived in the sample block cluster on Census Day. Thus, follow-up interviews for P-sample nonmatched cases were carried out when there was a possibility the residence status was not correct. Similarly, E-sample nonmatch cases were subject to follow-up interviews to determine if they were correctly or erroneously enumerated in the block cluster. Possible matches were interviewed to resolve their match status. There were also other cases sent to follow-up, such as matched people with unresolved residence status and other types of cases considered to have the potential for geographic errors in the P sample. The person follow-up interview used a paper questionnaire. Interviewers gathered information that permitted each person to be coded as a matched resident/nonresident or a nonmatched resident/nonresident of the block cluster on Census Day. There was considerable emphasis on obtaining a knowledgeable respondent before the follow-up questions were asked.

Footnote 1: Excludes data-defined person records temporarily removed from the census.
After the follow-up interview was completed, the results were reviewed by clerks who assigned final status to these cases using an automated system.

Activity 10: Missing Data Processing

Timing: December, 2000 through the early part of January, 2001.

Since the results of the matching operation were to be used in the estimation phase of the A.C.E., it was necessary to determine the match, correct enumeration, and residence status of all sample cases. When these could not be resolved through computer and clerical matching or through field follow-up interviews, the match, correct enumeration, or residence probabilities were imputed based on the distribution of outcomes of the resolved follow-up interviews. Also, as in the census, some respondents did not answer all the questions in the A.C.E. interview which were needed for estimation. If the variables tenure, sex, race, Hispanic origin, or age were blank for P-sample individuals, the missing information was imputed based on the distribution of the variable within the household, the overall distribution of the variable, or using hot-deck methods, depending on the variable. Imputation for missing information in the E sample was resolved in the census processing. Finally, a noninterview adjustment was made to account for the weights of households that should have been interviewed in A.C.E., but were not.

Activity 11: Dual System Estimation

Timing: Late January, 2001.

Dual system estimation was used to estimate the net undercount or overcount of the household population included in the census. Coverage estimates of persons living in group quarters or in Remote Alaska areas were not made. The term dual system estimation is used because data from two independent systems are combined to measure the same population. After matching to the census, the P sample was used to measure the omission rate in the census. The E sample was used to measure the erroneous enumeration rate in the census. The dual system estimator assumes that all persons have the same probability of being captured in the census. This is obviously an oversimplification of the existing situation. Post-stratification sharply reduced the likelihood that this assumption would bias the results, since it only requires equal capture probabilities within post-strata.
Post-stratification. Dual system estimation was used to calculate the proportion of persons missed in each of a number of relatively homogeneous population groups called post-strata. The post-strata for the Census 2000 A.C.E. were defined by the variables: race/Hispanic origin domain, age/sex, tenure, census region, metropolitan statistical area size/type of enumeration area, and census return rate. A complete cross-classification of these variables would have unnecessarily increased the variances of the estimates due to small expected sample sizes in many of the post-strata. Consequently, many of the detailed cells were combined. In the United States, there were 448 potential post-strata which were collapsed to 416 post-strata on the basis of small observed sample sizes or high coefficients of variation.

The dual system estimate. The dual system estimate (DSE) for each post-stratum was defined by:

\[ DSE = DD \times \frac{CE}{N_e} \times \frac{N_p}{M} \]

where DD was the number of data-defined persons in the census at the time of A.C.E. matching (see footnote 2), CE was the weighted estimate of the number of people in the census who were correctly enumerated, N_e was the weighted estimate of the number of people in the census, N_p was the weighted estimate of the number of people found by the independent A.C.E. collection procedures, and M was the weighted estimate of the number of persons found by the independent A.C.E. collection procedures who were matched to persons enumerated in the census.

Activity 12: Model-Based Estimation for Small Areas

Timing: February, 2001.
Activities 1 through 11 were designed to provide estimates of net coverage for Census 2000. These estimates can serve two purposes. One purpose was to provide information on the quality of the census so that analysts can make more intelligent use of the data, and to help the Census Bureau improve procedures for future censuses. The second purpose was to have a basis for adjusting the census counts for net coverage, if deemed appropriate.
The sample sizes used in the A.C.E. provided adequate reliability for such estimates for the U.S. as a whole, and for major geographical areas. However, the sample sizes were too small to provide reliable estimates for most states, counties, cities, and the thousands of other municipalities that normally make use of census data. As a result, model-based estimation was used in these areas.

Model-based estimation treats the coverage correction factors as uniform within a given post-stratum. Another way of saying this is that the coverage error rate for a given post-stratum is assumed to be the same within all geographic areas. This assumption is obviously an oversimplification, and small errors are introduced. However, the model-based estimates provide a consistent set of estimates in which the sum of the population counts for small areas is equal to the dual system estimates of much larger areas (e.g., the U.S. total, regions, etc.).

Coverage correction factors were obtained by dividing the dual system estimates by the census counts of persons in housing units. Persons in group quarters were not adjusted for net coverage. Coverage correction factors for population groups that generally had good coverage were close to 1.00. Population groups with poor coverage had coverage correction factors higher than 1.00, while coverage correction factors less than 1.00 in a post-stratum occurred when erroneous enumeration rates in the census exceeded omission rates.

A coverage correction factor was calculated for each post-stratum. If a post-stratum was estimated to have more persons than the census count, within each block a random sample of the appropriate size of census people in the post-stratum was selected. The data of the selected people were replicated in their blocks with a weight of +1. If a post-stratum was estimated to have fewer people than the census count, within each block a random sample of the appropriate size of people in the post-stratum was selected. The data of the selected people were replicated in their blocks with a weight of -1. Under this procedure no reported data for any individual were removed from the Census 2000 data files. A controlled rounding procedure was used to produce integer-valued model-based estimates at various geographic levels. Estimates were made at various levels by aggregating the data from the appropriate blocks and/or post-strata.

Footnote 2: The data-defined persons term excludes cases temporarily removed from the census.
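As a rough illustration of the arithmetic in Activities 11 and 12, the sketch below computes a dual system estimate for one post-stratum, reading the formula in the text as DD x (CE / N_e) x (N_p / M), and then divides it by a census count to obtain a coverage correction factor. The function names and all input numbers are made up for the example; they are not A.C.E. results, and the sketch ignores weighting, imputation, and controlled rounding.

```python
def dual_system_estimate(dd: float, ce: float, n_e: float,
                         n_p: float, m: float) -> float:
    """DSE for one post-stratum: DD * (CE / N_e) * (N_p / M).

    dd  : data-defined persons in the census
    ce  : weighted E-sample estimate of correct enumerations
    n_e : weighted E-sample estimate of people in the census
    n_p : weighted P-sample estimate of people found by the A.C.E.
    m   : weighted P-sample estimate of people matched to the census
    """
    if n_e <= 0 or m <= 0:
        raise ValueError("n_e and m must be positive")
    return dd * (ce / n_e) * (n_p / m)


def coverage_correction_factor(dse: float, census_count: float) -> float:
    """Ratio of the dual system estimate to the census count."""
    if census_count <= 0:
        raise ValueError("census_count must be positive")
    return dse / census_count


if __name__ == "__main__":
    # Hypothetical post-stratum: 95% correct enumeration, 90% match rate.
    dse = dual_system_estimate(dd=1000.0, ce=950.0, n_e=1000.0,
                               n_p=1000.0, m=900.0)
    ccf = coverage_correction_factor(dse, census_count=1020.0)
    print(round(dse, 1), round(ccf, 3))
```

In this made-up case the match rate is lower than the correct enumeration rate, so the estimate exceeds the census count and the factor comes out slightly above 1.00, the pattern the text associates with net undercoverage.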
Chapter 3. Design of the A.C.E. Sample

INTRODUCTION

The A.C.E. sample design was a multiphase, national sample of 301,000 housing units. Its development was heavily influenced by its planned predecessor, the Integrated Coverage Measurement survey (ICM). Initial plans for Census 2000 were for a one-number census corrected for coverage based on the ICM. A primary purpose of the ICM was to produce direct state estimates of coverage with sufficient reliability for apportionment population counts. This called for a state-based design and a much larger overall national sample of 750,000 housing units.

The January 1999 Supreme Court ruling against the use of sampling for apportionment resulted in a change of plans for the Census 2000 coverage survey, for which the primary goal became the production of reliable census coverage estimates for the nation and for selected sub-populations. This did not require as large a sample.

The A.C.E. sample design was derived from the ICM sample design. By the time the change of plans for the Census 2000 coverage survey occurred, many operational plans for the ICM were too far advanced to make the significant changes required for a newly conceived sample design plan. The implementation plans and software systems for creation of the sampling frame and selection of the ICM sample were moving along and almost ready to start. Much of the field office infrastructure and staffing was being put in place for the first field operation under the ICM sample plan. It was critical to proceed as planned in order to meet schedules.

The A.C.E. sampling plan was thus developed as a multiphase design. The much larger ICM sample was first selected. Field staff canvassed the sample areas to create an independent address list. Then, using updated measures of size from the field canvass, the ICM sample was re-stratified and reduced with differential probabilities of selection to create the A.C.E. sample design.

Sections on the A.C.E. sample and its design are directed to a general audience. They provide results of the A.C.E. sample along with a broad overview of the sample design. Later sections of this chapter provide a more in-depth description of the A.C.E. design and are available for readers who desire greater detail.

A.C.E. SAMPLE OVERVIEW AND RESULTS

The A.C.E. consisted of two parts. The Population Sample, P sample, and the Enumeration Sample, E sample, have traditionally defined the samples for dual system estimation. Both the P sample and the E sample measured the same household population. However, the P-sample operations were conducted independent of the census. The E sample consists of census enumerations in the same sample areas as the P sample. After matching with the census lists and reconciliation, the P sample yields an estimated rate at which the population was missed in the census, whereas the E sample yields an estimated rate at which enumerations were erroneously included in the census. Combining them yields an A.C.E. estimate of net census coverage of the household population.

The Accuracy and Coverage Evaluation had three sampling phases:

1. First-phase sample. The selection of the ICM sample, comprising a large number of sample areas for which a list of housing unit addresses was created independent of the census.
2.Second-phasesample.Thereductionofthefirst-phasesamplewhichresultedintheA.C.E.sampleareas.3.Third-phasesample.ThereductionofhousingunitsbysubsamplingwithinunusuallylargeA.C.E.sampleareas.Table3-1summarizestheA.C.E.samplesizeaftereachphaseofsamplingfortheUnitedStates.Thedatesgiveninthetablearetheproductiondates.Thehousingunit countsareapproximate,basedonthebestknowninfor-mationatthetimeoftheparticularsamplingphase.SectionIChapter33-1DesignoftheA.C.E.SampleU.S.CensusBureau,Census2000 Table3-1.Census2000A.C.E.SampleSizesbySamplingPhaseStartandfinishdateSamplingphaseSampleareasEstimatedhousingunitsMarch,1999thruJune,1999First-phase29,1361,989,000December,1999thruFebruary,2000Second-phase11,303844,000 April,2000thruMay,2000Third-phase11,303301,000within-clusterreduction3,153106,000 nowithin-clusterreduction8,150195,000SURVEYCHARACTERISTICSANDTHEA.C.E.SAMPLEDESIGNMainCharacteristicsoftheA.C.E.Sample TheA.C.E.sample:*Isaprobabilitysampleof301,000housingunitsin11,303sampleareasfortheUnitedStates.*YieldsestimatesofnetcensuscoverageofpersonsinhouseholdsandhousingunitsforthenationexcludingRemoteAlaska.*HasindependentsamplesineachstateandtheDistrictofColumbia,buttherearenostate-baseddesign criteria.*Hastotalstatesamplesizesroughlyproportionaltopopulationsizewiththeexceptionthatthesmaller stateshaveadditionalsample;thesesmallerstateshavesimilarsamplesizes.*Usessomedifferentialsamplingwithinstatesforareasthatmaycontributedisproportionatelytototalvariance orhavehigherconcentrationsofhistoricallyunder-countedpopulationgroups.*HasaseparatesampleofAmericanIndianReservationandotherassociatedtrustlands.*Usesupdatedmeasuresofsizeateachphaseof sampling.*Balancesoperationallimitationssuchasfieldworkloadsandstatisticalissuessuchasweightvariation.OverviewoftheDesignTheA.C.E.usesamultiphasesampletomeasurethenetcoverageforthehouseholdpopulationinCensus2000.Thenationalsample,301,000housingunitsin11,303 
sampleareas,wasdistributedamongthe50statesandtheDistrictofColumbiaroughlyproportionaltopopula-tionsizeexceptforthesmallerstatesthathadtheir samplesincreased.Primarysamplingunit.TheblockclusterwasthePri-marySamplingUnit(PSU)fortheA.C.E.Eachblockcluster consistedofoneormoregeographicallycontiguouscen-susblocks.Eachblockclustercontainedonaverage30housingunits,whichwasanefficientinterviewerwork-load.Animportantblockclustercharacteristicwaswell-defined,physicalboundaries.Ambiguousblockclusterboundariescouldpotentiallyleadtoerrorsofomissionor erroneousinclusionintheA.C.E.sample.PhasesoftheA.C.E.sample.ThreephasesoftheA.C.E.samplingwere:1.Selectionofaninitialsampleofapproximately30,000blockclustersforwhichthefieldstaffdevelopedan independentlistofhousingunitaddresses.2.Selectionfromtheinitialsampleresultsofasub-sampleofblockclustersfortheA.C.E.samplebasedontheresultsoftheindependentlist.3.Selectionofasubsampleofhousingunitswithinlargeblockclusters.Firstphaseconsistedoftheselectionofasystem-aticsampleineachstate.InthefirstphaseoftheA.C.E.sampling,blockclustersineachstatewereclassi-fiedbysizeintofourmutuallyexclusivegroupsknownas samplingstrata:(1)clusterswith0to2housingunits(smallstratum),(2)clusterswith3to79housingunits(mediumstratum),(3)clusterswith80ormorehousing units(largestratum),and(4)clustersonAmericanIndianReservationswiththreeormorehousingunits(AmericanIndianReservationstratum).Blockclusterswith80or morehousingunitswereselectedwithhigherprobability thanmediumclustersinthisphasebecausehousingunitsinlargeclustersweresubsampledinalateroperation,bringingtheoverallprobabilityofselectiontheinverseof thesamplingweightforhousingunitsintheseclustersmoreinlinewiththeoverallselectionprobabilitiesofhousingunitsinmediumclusters.Withineachsampling stratum,clustersweresortedandasystematicsamplewasselectedwithequalprobability.SecondphaseinvolvedthereductionoftheICMfirst-phasesampletotheleveldesiredforthe | |||
A.C.E. In the second phase, the block clusters from the medium and large sampling strata were re-stratified based on the estimated demographic composition of the block clusters and the relationship between the housing unit count from the independent list and the January 2000 updated census address list. This was done separately for the medium and large strata within each state. These substrata are referred to as reduction strata. Within each reduction stratum, the clusters were sorted, and a systematic sample was selected with equal probability. This reduction used different selection probabilities across the reduction strata within a state and across states.

Next, using housing unit counts from the independent list and the January 2000 updated census address list, the small block clusters were stratified within each state by size, and systematic samples were selected from each stratum with equal probability. All clusters from the small sampling stratum with 10 or more housing units based on the updated information were retained. All clusters from the small sampling stratum that were on American Indian land as well as List/Enumerate clusters were also retained.
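The first two phases described above both rely on equal-probability systematic selection within a stratum. As a minimal sketch, assuming the steps of the general sampling procedure given later in this chapter (a random start RS = RN x 1/PS, a sampling interval of 1/PS, and sequence numbers rounded up to the next integer), the selection could be written in Python as follows. The function name `systematic_sample`, its parameters, and the stopping rule (stop once a rounded sequence number exceeds N) are illustrative assumptions, not the Census Bureau's production code.

```python
import math
import random


def systematic_sample(n_units, ps, rn=None):
    """Return the 1-based index numbers selected from n_units sorted units.

    ps is the probability of selection for the stratum (0 < ps <= 1).
    rn is the random number in (0, 1]; it is drawn here if not supplied.
    """
    if not 0 < ps <= 1:
        raise ValueError("ps must be in (0, 1]")
    if rn is None:
        rn = 1.0 - random.random()  # random.random() is in [0, 1), so rn is in (0, 1]
    interval = 1.0 / ps             # inverse of the probability of selection
    rs = rn * interval              # random start, 0 < RS <= 1/PS
    selected = []
    k = 0
    while True:
        sequence_number = rs + k * interval
        index = math.ceil(sequence_number)  # round up; an integer rounds to itself
        if index > n_units:                 # assumed stopping rule
            break
        selected.append(index)
        k += 1
    return selected


# Example: 20 sorted clusters, selection probability 1 in 4, fixed random number.
print(systematic_sample(20, 0.25, rn=0.5))
```

With a fixed `rn` the result is reproducible, which makes it easy to check a selection by hand against the sorted list of units. Floating-point rounding can matter when 1/PS is not exactly representable; a production version might use exact rational arithmetic instead.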
The second phase of sampling was not done for the American Indian Reservation sampling stratum.

The third phase consisted of the sample reduction of housing units within large block clusters. In the third phase of A.C.E. sampling, a subsample of housing units was selected within large clusters. If a cluster contained 79 or fewer housing units, all the housing units were included in the A.C.E. sample. In clusters with 80 or more housing units, a subsample was selected to reduce the cost of data collection. This phase of sampling resulted in lower variation of selection probabilities for housing units within the same reduction stratum because the large clusters had a higher probability of selection at the first phase. This subsampling was done by forming groups of adjacent housing units, called segments. A systematic sample of segments within each cluster was selected. All housing units in the selected segments were included in the A.C.E. sample.

The P sample and the E sample. The P sample consisted of the households used for the A.C.E. interviews that were conducted in these selected block clusters and block cluster segments. The E sample was the set of census enumerations in these same block clusters and block cluster segments.

Measures of Size

As stated earlier, the A.C.E. sample design used updated measures of size at each phase of sampling.

First-phase sample. The block cluster measure of size for the first-phase sample was based on preliminary census files existing in the spring of 1999. Ideally, the source of the block cluster measure of size would have been the Decennial Master Address File, the base file of census addresses for the decennial programs. However, the first version of this file was not available until the summer of 1999, too late for use in the block clustering. Instead, the first-phase measure of size was typically the higher of the preliminary census housing unit count or the 1990 census address count for a block cluster containing city-style addresses (house number and street name). For block clusters with non-city-style addresses, the measure of size was the preliminary 2000 census housing unit count. The rules for determining which housing units on the preliminary 2000 census files would eventually move forward to the Decennial Master Address File had not been defined,
so the block cluster measure of size was based on a reasonable set of criteria, but not the final set.

Second-phase sample. For the second phase of sampling, the block cluster measure of size was the count of housing units on the list of housing unit addresses created independently of the census in the fall of 1999. The reduction of the medium and large block clusters used a preliminary count of these housing units, which was a clerical tally of housing units from the listing sheets. The small block cluster reduction used the count of housing units from the independent listing sheets after the addresses had been keyed. For the most part, the preliminary and the keyed counts for each block cluster were identical, but for some clusters there were differences. Using a preliminary count was necessary because the medium and large cluster reduction had to be completed before the keying of the independent listing sheets was done.

Third-phase sample. For the third phase of A.C.E. sampling, the block cluster measure of size was the housing unit count resulting from the housing unit matching and follow-up operation. This operation confirmed the count resulting from the independent listing and removed any nonexistent addresses from the sampling frame.

FIRST PHASE OF THE A.C.E. SAMPLE DESIGN

The sample selection during the first phase consisted of three major steps:

1. Definition of the primary sampling units.
2. Stratification and allocation of the primary sampling units within each state.
3. Selection of the primary sampling units within each state.

Defining the Primary Sampling Unit

The Primary Sampling Units (PSUs) for the A.C.E. were block clusters. The PSUs were delineated in such a way that they encompassed the entire land area of the United States, except for extremely remote areas of Alaska. Each block cluster consisted of a census block or several geographically contiguous census blocks. They contained an average of 30 housing units. The land area for each PSU was made reasonably compact so it could be traversed by an interviewer in the field without incurring unreasonable costs.

Why the block cluster? A basic design decision, which was a continuation from the 1990 Post-Enumeration Survey, was that the PSU would be a block cluster, a single block or a group of adjacent blocks established for the collection of Census 2000 information. These blocks may be standard city blocks or irregularly shaped areas with identifiable political or geographic boundaries. Using block clusters as PSUs, instead of counties or county groups that are more commonly used in national surveys, improved the precision considerably with only a modest increase in costs.

An alternative sample design was considered that would have defined PSUs by segmenting whole blocks into smaller components (roughly one-half of a block). The alternative design would likely have resulted in reduced sampling error, but was rejected because it would increase costs (primarily due to increased matching workloads and interviewer travel) and probably would have resulted in matching errors due to problems in identifying (spatially) the PSU boundaries.

Goals of block clustering. Block clusters were formed to meet both statistical and operational goals. In the Census 2000 Dress Rehearsal, a small census block was by definition a single block cluster. This rule led to a large number of small block clusters that could potentially exert undue influence on the final population and variance estimates. One feature of block clustering under the Census
2000 A.C.E. procedure was to combine small census blocks with adjacent census blocks, if the neighboring block contained one or more housing units. This change in the treatment of small census blocks had an enormous impact on the number of small block clusters, which was reduced by approximately 65 percent as seen in Table 3-2. Still, many block clusters contained zero housing units. Roughly 70 percent of the zero housing unit blocks occurred in sparsely populated areas. Without populated neighboring blocks, these zero housing unit blocks remained stand-alone zero block clusters.

The two operational goals of forming block clusters were to increase listing efficiency and to reduce the chance of listing error. The first goal was met by collapsing census blocks to produce block clusters that were geographically compact and which averaged about 30 housing units, a manageable workload. The second goal was to create block clusters that were well defined to minimize the chance that the cluster would be listed incorrectly. For example, a listing error may result when a census block has an invisible or nonphysical boundary, such as city limits, making it unclear where the block boundary was. As a result, census blocks separated by invisible boundaries were always combined.
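Once block clusters were formed, each one received a measure of size and a first-phase sampling stratum. The Python sketch below illustrates the rules stated earlier in this chapter: for clusters with city-style addresses the measure of size is the higher of the preliminary census count and the 1990 census count (the text says "typically", so exceptions are ignored here), and clusters fall into the small, medium, large, or American Indian Reservation strata by housing unit count. The function and parameter names are hypothetical.

```python
def first_phase_measure_of_size(prelim_2000_count, count_1990, city_style):
    """Measure of size for a block cluster at the first phase (simplified)."""
    if city_style:
        # City-style addresses: higher of the two available counts.
        return max(prelim_2000_count, count_1990)
    # Non-city-style addresses: preliminary 2000 census count only.
    return prelim_2000_count


def first_phase_stratum(housing_units, on_air):
    """Assign one of the four first-phase sampling strata."""
    if housing_units <= 2:
        return "small"   # 0 to 2 housing units, including reservation clusters
    if on_air:
        return "AIR"     # 3 or more housing units on an American Indian Reservation
    if housing_units <= 79:
        return "medium"  # 3 to 79 housing units
    return "large"       # 80 or more housing units


size = first_phase_measure_of_size(25, 31, city_style=True)
print(size, first_phase_stratum(size, on_air=False))
```

Checking the small stratum before the reservation flag reflects the statement that small block clusters on American Indian Reservations were eligible for selection in the small cluster stratum.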
Limitations. As mentioned earlier, the block cluster measure of size for the first phase was based on preliminary census address counts. Some census operations that helped build the census address list were not available at the time block clustering started. Instead, a snapshot of the best known information was used. This presented some limitations with the data used for block clustering.

* Address limitations: The results of the Block Canvassing and Local Update of Census Addresses (LUCA) operations were not incorporated into the census address list in time for block clustering. Block Canvassing was a Census 2000 field operation in mailout/mailback areas (mostly city-style addresses). The Census Bureau sent staff into the field to canvass their assignment areas and provide updates to the address list such as corrections, adds, or deletes. Local Update of Census Addresses was also a Census 2000 program that provided an opportunity for local and tribal governments to review and update address information in the census address list.

Table 3-2. Accuracy and Coverage Evaluation: Block Cluster Summary Statistics (1)

                                         Preliminary number of housing units
                                         0-2         3-79        80+       Total
Number of census blocks (2)              2,969,000   4,009,000   245,000   7,223,000
Number of block clusters                 1,029,000   2,486,000   252,000   3,767,000
Number of blocks per cluster (3)         1.3         2.2         1.5       1.9
Number of housing units per cluster      0.3         29.2        181.9     31.5

(1) The United States and Puerto Rico are included in these summary statistics.
(2) Count of census collection blocks before clustering and before block suffixing. Does not include water blocks or census blocks in Remote Alaska.
(3) These numbers are not the first row divided by the second row. They are the number of census blocks in each block cluster size category divided by the number of block clusters in each category. For example, if two census blocks with 40 housing units collapse to form an 80 housing unit block cluster, those two census blocks are counted in the 80+ category for the number of blocks per cluster computation. Block clustering can combine across categories; therefore, the first and second rows are not consistent.
* Geographic limitations: Each block in the census address list had a Type of Enumeration Area (TEA) assignment. For Census 2000, TEA is a classification that identified both the census enumeration method and the method used to compile the census address list. The block clustering operation occurred concurrently with the census review of TEA assignments to ensure the most complete coverage of the area. This review process sometimes changed the TEA assignment of blocks after the block cluster was defined. On a few occasions, this resulted in a block cluster consisting of blocks that had different methods for compiling the census address list. For example, a block cluster consisted of three blocks, and all three blocks had a TEA assignment of Block Canvassing and Mailout/Mailback at the time of block clustering. After the census TEA review, one of those blocks was converted to an Address Listing and Update/Leave TEA assignment. For a complete list of TEAs for Census 2000, see the attachment or visit http://www.geo.census.gov/mob/homep/teas.html.

General rules for defining block clusters.

* Block clusters were formed by combining neighboring Census 2000 blocks.
* Block clusters did not cross specific geographical boundaries. Among these were county, interim census tract, Local Census Office, TEA group, military area, and American Indian Country. For TEA groups, blocks from certain TEAs could be clustered together if the TEAs had the same method for compiling the address list. American Indian Country refers, collectively, to lands that are American Indian Reservation or other trust lands, tribal jurisdiction statistical areas (now known as Oklahoma Tribal Statistical Areas), tribal designated statistical areas, and Alaska native village statistical areas.
* Blocks separated by an invisible boundary, a city line, for example, were clustered except for the situations described above.
* Whenever possible, small census blocks, those with fewer than three housing units, were clustered with neighboring census blocks containing housing units to reduce the total number of small block clusters. If there was no neighboring census block with housing units,
the small census block was a cluster by itself.
* To prevent block clusters from becoming too large with respect to housing unit size, census blocks with 80 or more housing units were generally not clustered with other census blocks.
* In addition to the criteria of unit size, any block larger than 15 square miles was generally a block cluster by itself.

These rules produced 3.8 million block clusters, about half the 7.2 million non-suffixed census blocks. The block clusters had an average of 29.2 housing units per medium block cluster and an average of 31.5 overall. The number of small block clusters also decreased from nearly three million to about one million, an approximate 65 percent reduction from the Census 2000 Dress Rehearsal rules of defining a small block to be a cluster by itself. However, since about 70 percent of small blocks occurred in less populated areas with little or no population to combine, many single zero-housing unit block clusters were formed.

Stratifying and Allocating the Primary Sampling Units

Stratifying the first-phase sample. Prior to sampling, block clusters were stratified according to the expected number of housing units and the American Indian Reservation (AIR) status of the block cluster. The four sampling strata and their definitions are presented in Table 3-3.

Allocating the first-phase sample. As stated earlier, the Census Bureau was preparing to conduct the ICM, a much larger coverage measurement survey of 750,000 housing units, when the use of sampling for apportionment counts was disallowed by the Supreme Court in January, 1999. To keep the coverage measurement survey on schedule, the Census Bureau went ahead with the plans to select the ICM sample and create independent address lists. This was followed by the subsampling of the first-phase sample to produce the A.C.E. sample design.

The first-phase sampling plan was a national sample of 30,000 block clusters: 25,000 medium and large block clusters and 5,000 small block clusters. Included in the 25,000 block clusters was a separate sample of block clusters for American Indian Reservations.

It is important to point out that the allocation of the 25,000 medium and large block clusters was dependent on the ICM sample design and under the assumption of roughly 30 housing units per block cluster. The allocation of the 5,000 small block clusters to the states and the separate American Indian Reservation sample to the states was done prior to defining block clusters for all states, since the first-phase sampling was done on a state-by-state flow basis. This means that the first-phase sample was selected for some states before the block clusters had been defined for other states. As a result, we used the best information we had at the time to carry out the allocation.

Medium and large block clusters. The 25,000 medium and large block clusters were allocated to the states to meet the ICM sample requirements (Schindler, 1998) with some minor modifications. Most states had between 300 and 500 block clusters and the very largest states had an allocation of between 1,000 and 2,000 block clusters.

Table 3-3. First-Phase Sampling Strata

First-phase sampling stratum     Definition
Small                            0 to 2 housing units
Medium                           3 to 79 housing units
Large                            80 or more housing units
American Indian Reservation      3 or more housing units and on American Indian Reservations

Within each state, the block cluster sample was proportionally allocated to the medium and large sampling strata based on the number of housing units in the sampling
stratum:

$$c_{state,k} = C_{state} \times \frac{H_{state,k} }{H_{state} }$$

where

k = medium or large sampling stratum;
$c_{state,k}$ = target number of clusters in sampling stratum k within state;
$C_{state}$ = target number of A.C.E. first-phase medium and large sample clusters for state;
$H_{state,k}$ = number of housing units in sampling stratum k within state;
$H_{state}$ = number of housing units in the medium and large strata in state.

As an example, let's say that 402 total medium and large block clusters were allocated to a particular state. Assuming that there are an expected 9,000 housing units in all clusters in the medium sampling stratum and 12,060 housing units in both the medium and large sampling strata, the target number of clusters from the medium sampling stratum for the state is calculated as follows:
$$c_{state,medium} = 402 \times \frac{9{,}000}{12{,}060} = 300.$$

The target number of clusters from the large sampling stratum would then be 102.

Small block clusters. Because of cost considerations, small block clusters were generally sampled at a lower rate than either medium or large clusters. An overall allocation of 5,000 small block clusters was chosen because a total of 30,000 block clusters was deemed manageable for creating independent address lists. The high weights resulting from the lower sampling rates were not expected to have a serious impact on the estimates or variances for most clusters selected from the small block cluster sampling stratum. However, for clusters that were initially classified as small, but were observed to have a larger number of housing units, there was concern about high sampling weights disproportionately contributing to variance. In an attempt to avoid the problems associated with the high weights, a larger number of small clusters was initially selected, followed by an independent address list, followed by a subsample to remain in sample. Using updated measures of size for those 5,000 small block clusters in the small cluster reduction helped to target clusters that could have contributed disproportionately to the variance. These initial 5,000 small clusters were allocated to states proportionately to their estimated total number of housing units in small blocks.

Ideally, we would have allocated the 5,000 block clusters proportionally to states based on the number of small block clusters in the state. This was not possible because the first-phase sampling was done on a flow basis.

American Indian Reservation block clusters.
To ensure sufficient sample for calculating reliable coverage estimates for American Indians living on reservations, we allocated 355 block clusters to American Indian Reservations nationwide. The 355 clusters were allocated to 26 states proportional to the 1990 population of American Indians living on reservations. Small block clusters on American Indian Reservations were not included in these 355 block clusters. These clusters were eligible for selection in the small cluster stratum. Block clusters within states containing little or no American Indian population on reservations were represented in the medium and large strata.

This sample allocation resulted in variable first-phase selection probabilities across the states despite our goal of having proportional allocation of the American Indian Reservation (AIR) sample. This occurred because the average number of housing units per American Indian Reservation block cluster varied across states. To get similar first-phase selection probabilities, we needed to have all of the block clustering completed before allocating the sample. However, the first-phase sampling was done on a flow basis.

Selecting the Primary Sampling Units Within Each State

Calculation of the sampling parameters. The block cluster probability of selection (PS) for each of the four sampling strata in each state is the ratio of the target sample size to the number of clusters in the stratum. It takes the following form:
$$PS_{state,k} = \frac{c_{state,k} \times L_{state} }{C_{state,k} },$$

where

$PS_{state,k}$ = probability of selection (sampling rate) in sampling stratum k within state;
$C_{state,k}$ = number of clusters in sampling stratum k within state;
$c_{state,k}$ = target number of clusters in sampling stratum k within state;
$L_{state}$ = the factor to reduce the number of clusters to select for the state, if the expected listing workload exceeded the planning estimate.
$$L_{state} = \begin{cases} 1 & \text{for small, medium and AIR sampling stratum} \\ 0 < L \le 1 & \text{for large sampling stratum} \end{cases}$$

The large block cluster sampling rate was reduced if the expected number of housing units to list was greater than the planning estimate of the listing workload. A second step of sampling was necessary in Missouri and Indiana because the selected sample of clusters resulted in a greater number of housing units to list than was expected. To meet operational constraints, a subsample of the first-step selected block clusters was selected. The second step of sampling only occurred in the large sampling stratum, since that stratum disproportionately contributed to the listing workload. The second step occurred only if the estimated number of housing units in the medium and large strata was at least ten percent larger than the planning estimate of the number of housing units to be listed.

For states needing the second step of sampling, the sampling rate took the following form:
$$PS2_{state} = \frac{PW_{state} }{W_{state} }$$

where

$PS2_{state}$ = second-step sampling rate for the large sampling stratum in state,
$W_{state}$ = resulting workload estimate from sample selection for the large sampling stratum in state,
$PW_{state}$ = planning workload estimate for the large sampling stratum in state.

Sorting the PSUs. The first-phase clusters were sorted within each sampling stratum as follows:

* American Indian Country Indicator
* Demographic/Tenure Group
* 1990 Urbanization
* County code
* Block cluster identification number

Although there was no differential sampling within the four first-phase sampling strata, the clusters were sorted by several variables in an attempt to improve the representativeness of the sample of block clusters. The first variable was the American Indian Country Indicator, which separated the block clusters into three American Indian
categories:

1. American Indian Reservation or other trust land,
2. tribal jurisdiction statistical area, Alaska native village statistical area or tribal designated statistical area, and
3. all other areas.

The second sort variable was the demographic/tenure group. Block clusters containing similar demographic/tenure proportions, based on 1990 census data, were grouped. To aid in selecting a sample that was well represented by the six major race/origin groups, as well as owners and renters, block clusters were classified into 12 demographic/tenure groups. Although many block clusters tend to have a large proportion of one demographic/tenure group, rarely were they entirely composed of only one, thus many clusters fit well in two or more categories.
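Because a cluster can qualify for more than one group, the chapter resolves the choice with a hierarchical rule whose thresholds appear in Table 3-4. A minimal Python sketch of that rule follows. It assumes each cluster is described by the percentage of its population in each demographic/tenure group, and that a group is assigned when its percentage meets or exceeds the threshold (the chapter's own example assigns a cluster with exactly 20 percent Asian renters to that group). The names `THRESHOLDS` and `assign_group` are illustrative.

```python
# Thresholds (percent) in priority order, as listed in Table 3-4.
THRESHOLDS = [
    ("Hawaiian and Pacific Islander renters", 10),
    ("Hawaiian and Pacific Islander owners", 10),
    ("American Indian and Alaska Native renters", 10),
    ("American Indian and Alaska Native owners", 10),
    ("Asian renters", 20),
    ("Asian owners", 20),
    ("Hispanic renters", 20),
    ("Hispanic owners", 20),
    ("Black renters", 25),
    ("Black owners", 25),
    ("White and other renters", 30),
]


def assign_group(percent_by_group):
    """Return the first group, in priority order, whose threshold is met."""
    for group, threshold in THRESHOLDS:
        if percent_by_group.get(group, 0) >= threshold:
            return group
    return "All others"


# The chapter's example: 20% Asian renter, 40% Asian owner, 40% White and other renter.
example = {"Asian renters": 20, "Asian owners": 40, "White and other renters": 40}
print(assign_group(example))  # Asian renters
```

Walking the list in order is what gives the smaller groups priority over the larger ones and renters priority over owners.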
To ensure that each cluster was assigned to only one group, a hierarchical assignment rule was developed so that when a cluster exceeded the first group threshold, it was assigned to that group. These thresholds were based on a multivariate clustering method applied to 1990 census blocks. Table 3-4 lists these threshold values. The hierarchy gives the smaller demographic groups priority over the larger ones and renters priority over owners. For example, if the approximate distribution of a block cluster population was 20 percent Asian Renter, 40 percent Asian Owner, and 40 percent White and other Renter, then the block cluster was assigned to the Asian Renter demographic/tenure group.

Table 3-4. Demographic/Tenure Group Thresholds (50 States and the District of Columbia)

Order   Demographic/Tenure Group                       Threshold
1       Hawaiian and Pacific Islander renters          10%
2       Hawaiian and Pacific Islander owners           10%
3       American Indian and Alaska Native renters      10%
4       American Indian and Alaska Native owners       10%
5       Asian renters                                  20%
6       Asian owners                                   20%
7       Hispanic renters                               20%
8       Hispanic owners                                20%
9       Black renters                                  25%
10      Black owners                                   25%
11      White and other renters                        30%
12      All others                                     all others

A third sort variable was the estimated level of urbanization based on 1990 data for each block cluster. Each block cluster was categorized either as an urbanized area with 250,000 or more people, an urbanized area with less than 250,000 people, or a non-urban area. And finally, the clusters were sorted geographically using county and cluster number.

General sampling procedure. A systematic sample of block clusters was selected from each sampling stratum with each block cluster having the same probability of selection within a sampling stratum. The method used to select systematic samples follows:

1. Sampling units were sorted using the PSU sort criteria described at each sampling phase.
2. Each successive PSU was assigned an index number 1 through N within each sampling stratum where N is the number of PSUs in the stratum.
3. A random number (RN) between zero and one, $0 < RN \le 1$, was generated.
4. A random start (RS) for the sampling stratum was calculated. The random start was the random number multiplied by the inverse of the probability of selection, $RS = RN \times 1/PS$, such that $0 < RS \le 1/PS$.
5. Sampling sequence numbers were calculated. Given N PSUs, sequence numbers were: $RS$, $RS + 1 \times (1/PS)$, $RS + 2 \times (1/PS)$, ..., $RS + n \times (1/PS)$, where n was the largest integer such that
$[RS + (n-1) \times 1/PS] \le N$. Sequence numbers were rounded up to the next integer. An integer number rounded to itself.
6. Sampling sequence numbers were compared to the index numbers assigned to PSUs. The PSU with the index number corresponding to the rounded sequence number was selected. All PSUs without corresponding index numbers were not in sample.

First-Phase Sample Results

Table 3-5 lists the block cluster sample sizes and the number of housing units by sampling stratum for each state, the District of Columbia, and the nation.

Table 3-5. State First-Phase Sample Results by First-Phase Stratum

                        First-phase housing units (1)                       First-phase block clusters
State                   Small    Medium     Large       AIR      Total       Small   Medium   Large   AIR   Total
Alabama                 60       7,900      19,000      0        26,960      116     286      109     0     511
Alaska                  20       5,200      23,200      20       28,440      20      190      137     1     348
Arizona                 20       7,800      44,700      2,600    55,120      86      269      180     113   648
Arkansas                40       9,600      15,900      0        25,540      90      353      101     0     544
California              50       45,000     227,600     230      272,880     184     1,442    1,311   11    2,948
Colorado                20       8,000      25,600      60       33,680      83      293      157     2     535
Connecticut             10       6,100      25,600      0        31,710      20      211      159     0     390
Delaware                20       7,200      28,700      0        35,920      20      243      156     0     419
District of Columbia    10       4,800      50,500      0        55,310      20      132      247     0     399
Florida                 50       7,500      50,100      30       57,680      145     259      230     1     635
Georgia                 70       6,100      30,300      0        36,470      154     220      162     0     536
Hawaii                  10       3,000      42,400      0        45,410      20      103      161     0     284
Idaho                   10       8,200      10,900      140      19,250      54      312      75      6     447
Illinois                100      8,600      22,300      0        31,000      185     281      140     0     606
Indiana                 80       6,100      9,700       0        15,880      140     202      51      0     393
Iowa                    120      6,800      9,500       0        16,420      147     242      53      0     442
Kansas                  110      6,400      11,100      30       17,640      193     237      63      1     494
Kentucky                60       7,200      22,300      0        29,560      96      268      135     0     499
Louisiana               10       11,300     24,900      0        36,210      65      407      155     0     627
Maine                   20       5,800      11,000      10       16,830      38      226      79      1     344
Maryland                20       5,300      38,000      0        43,320      36      177      175     0     388
Massachusetts           20       6,400      22,000      0        28,420      38      229      140     0     407
Michigan                50       7,900      15,100      150      23,200      122     268      104     5     499
Minnesota               70       6,000      14,000      270      20,340      141     208      83      10    442
Mississippi             40       8,400      11,700      120      20,260      81      303      77      3     464
Missouri                110      5,700      14,500      0        20,310      162     200      71      0     433
Montana                 10       8,400      9,700       840      18,950      67      333      67      24    491
Nebraska                80       6,800      7,700       70       14,650      142     245      55      3     445
Nevada                  10       6,400      57,800      190      64,400      46      225      230     5     506
New Hampshire           20       5,700      15,400      0        21,120      25      201      106     0     332
New Jersey              10       8,700      30,100      0        38,810      39      282      178     0     499
New Mexico              10       9,300      24,800      1,640    35,750      108     335      136     70    649
New York                80       17,600     124,700     70       142,450     143     603      631     5     1,382
North Carolina          100      6,700      20,700      80       27,580      143     236      121     4     504
North Dakota            100      5,900      9,100       340      15,440      121     236      64      12    433
Ohio                    110      7,800      24,000      0        31,910      132     268      133     0     533
Oklahoma                60       9,000      17,300      270      26,630      142     314      101     8     565
Oregon                  10       5,200      15,400      70       20,680      86      195      90      3     374
Pennsylvania            110      12,900     22,600      0        35,610      180     427      146     0     753
Rhode Island            10       7,600      18,000      0        25,610      20      256      108     0     384
South Carolina          40       8,200      19,100      0        27,340      95      285      112     0     492
South Dakota            50       5,800      9,200       450      15,500      106     242      57      27    432
Tennessee               90       7,800      25,400      0        33,290      133     285      137     0     555
Texas                   70       34,700     148,500     30       183,300     349     1,222    681     1     2,253
Utah                    10       9,100      23,900      120      33,130      38      312      144     7     501
Vermont                 20       5,600      12,000      0        17,620      21      201      88      0     310
Virginia                60       5,600      31,900      0        37,560      98      196      166     0     460
Washington              20       5,600      21,400      480      27,500      73      187      120     17    397
West Virginia           30       5,000      13,100      0        18,130      46      189      79      0     314
Wisconsin               80       6,200      8,200       220      14,700      119     211      58      10    398
Wyoming                 10       8,700      9,200       90       18,000      72      346      69      5     492
Total U.S.              2,400    438,600    1,539,800   8,620    1,989,420   5,000   15,393   8,388   355   29,136

(1) Preliminary census address list housing unit counts from spring 1999.

SECOND PHASE OF THE A.C.E. SAMPLE DESIGN

The second phase, often referred to as the A.C.E. reduction phase, linked the first-phase sample selection to the A.C.E. sampling plan. The A.C.E. reduction was the first of several operations that reduced the number of housing units from the nearly two million housing units in the independent listing to the approximately 300,000 housing units that were sent for interview. Since not all of the first-phase block clusters were required for A.C.E., the reduction subsampled those clusters, with the selected clusters retained for the A.C.E. operations.

Following the selection of the A.C.E. first-phase sample, field staff visited the block clusters and created an independent address list for A.C.E. These updated housing unit counts were used in the cluster subsampling phase. The cluster subsampling was done separately for:

* medium and large cluster reduction, and
* small block cluster reduction.

Medium and Large Cluster Reduction

The medium and large cluster reduction was the transition to the A.C.E. sampling plan. The resulting national sample allocation was roughly proportional to state population with some differential sampling within states. Only block clusters from the medium and large first-phase sampling strata in the 50 states and the District of Columbia were subsampled in this phase. As part of the sample reduction, two other objectives of the A.C.E. sample were implemented.

One objective of the medium and large cluster reduction design was to stratify the first-phase clusters based on the relationship of current housing unit counts from the A.C.E. independent listing and the updated census address list as of January, 2000. Clusters were sampled with different selection probabilities in order to reduce the variance contribution due to inconsistent housing unit counts between the updated census list and the independent list. Clusters with significant differences between the counts were expected to have high erroneous enumeration and high omission rates. The objective of differentially sampling these types of clusters was to reduce the sampling weights associated with clusters having relatively high numbers of missed persons or those enumerated in error, and, thus, having potentially high variance contributions.

A second objective of the medium and large cluster reduction design was to differentially sample clusters based on the estimated demographic composition of the cluster. Clusters with a high proportion of persons of Hispanic origin or persons belonging to a census race group other than White were classified into a minority stratum. These types of clusters were sampled at a higher rate than predominantly non-Hispanic White clusters, in order to increase the sample size and improve the reliability of the A.C.E. population estimates for these historically undercounted subgroups.

Stratifying second-phase clusters. Each block cluster was put into two categories for the medium and large cluster reduction: a demographic group and a consistency group. Block clusters were put into reduction strata based
on the combination of these two groups.

Demographic groups were based on the demographic/tenure groups created in the first-phase sample selection. The demographic/tenure groups represented a classification of block clusters, using the information of race/Hispanic origin and tenure of each block reported in the 1990 census. The demographic/tenure groups were used as a sort variable in the selection of the first-phase sample. For this reduction, clusters were put into two demographic groups by combining the 12 demographic/tenure groups in Table 3-4. The two demographic groups are:

1. Minority: block clusters from one of the ten minority demographic/tenure groups
2. Non-minority: block clusters from one of the two other demographic/tenure groups

For this reduction, two updated cluster housing unit counts were used: the independent listing housing unit count and the housing unit count from the updated census address list as of January 2000. The two housing unit counts were compared, and clusters were placed into consistency groups based on the relationship of the housing unit counts. Large differences between the counts indicated that coverage problems might occur; thus, the sampling weights for such clusters were controlled to avoid serious variance effects. Clusters were placed into three consistency groups as shown in Table 3-6.

Table 3-6. Second-Phase Sampling Consistency Groups

Relationship                                                       Consistency group
Independent list is at least 25 percent lower than census          Low inconsistent
Independent list is at least 25 percent greater than census        High inconsistent
Independent list is within +/- 25 percent of census                Consistent

For List/Enumerate clusters (see attachment), the census housing unit count was not known at the time of the reduction since this census operation had not started. Thus, all such clusters were classified as high inconsistent.

Based on the demographic group, the consistency group, and the independent listing housing unit count, block clusters were assigned to one of five reduction strata:

1. Minority (low inconsistent, high inconsistent, consistent)
2. Non-minority low inconsistent
3. Non-minority high inconsistent
4. Non-minority consistent
5. Medium stratum jumper

Medium stratum jumper clusters were selected from the medium sampling stratum for the first-phase sample, but had 80 or more independent listing housing units. Medium clusters were sampled at lower rates than large clusters in the first-phase sample since large clusters eventually were to undergo within-cluster housing unit subsampling, an operation that increases sampling weights. Medium stratum jumper clusters also went through within-cluster housing unit subsampling, meaning the already higher sampling weights of these clusters became even larger. Retaining all of the medium stratum jumper clusters in this reduction avoided introducing significant weight variation in the sample.

Allocating sample to strata. The first step was to allocate the national sample of 300,000 housing units to the 50 states and the District of Columbia, in most cases proportional to 1998 population estimates, with a minimum of 1,800 housing units in each state. Hawaii was allocated approximately 3,750 housing units due to its concentration of Hawaiian and Pacific Islanders for which separate population coverage estimates were planned.

Within each state, the second-phase selection probabilities varied somewhat among the strata. First, all clusters in the
mediumstratumjumperreductionstratumwereretained.Fortheremainingfourreductionstrata,higherretentionrateswereusedintheminority,non-minoritylowincon-sistentandthenon-minorityhighinconsistentreductionstratathanthenon-minorityconsistentstratum.Thestra-tumdifferentialsamplingfactoristheratiooftheprob-abilityofselectionforthestratumtotheprobabilityof selectionfortheconsistentstratum.Thefollowingstatementsdescribehowthestratumdiffer-entialsamplingfactorsweresettoyieldtheoverallstate samplesize.Thesearenotexactrules,butgiveasenseof howmuchdifferentialsamplingwithinstateswasdone.*Themaximumexpectedsamplingweightafterallsub-sampling,theinverseoftheoverallprobabilityofselec-tion,was650forthenon-minorityconsistentreduction stratum.*Themaximumdifferentialsamplingfactorwas3forthetwoinconsistentreductionstrata.*Thedifferentialsamplingfactorwasaround2fortheminorityreductionstratum,exceptinsmallstates wherealloftheminorityclusterswereretained.Thedifferentialsamplingfactorswereassignedusingguidelinesdesignedtoachievethetwoobjectivesofthereduction,whilealsocontrollingthesizeofthesamplingweightsandtheamountofdifferentialsampling.Thisled tothedesignofthedifferentialsamplingfactorssumma-rizedinTable3-7.Usingthestratumdifferentialsamplingfactorsandtheestimatednumberofhousingunits,thesampleallocationforeachreductionstratumwasderivedasfollows: | |||
$$T_g = T \cdot \frac{DSF_g \, H_g}{\sum_{g'=1}^{4} DSF_{g'} \, H_{g'} }$$

where,

g = A.C.E. second-phase sampling stratum,
T_g = Target number of sample housing units allocated to reduction stratum g,
T = State target number of sample housing units modified for medium stratum jumper clusters,
H_g = Estimated number of housing units in the reduction stratum based on the independent listing housing unit counts, and
DSF_g = Differential Sampling Factor for reduction stratum g.

Table 3-7. A.C.E. Second-Phase Sample Design Parameters for Large and Medium Clusters

                        Differential Sampling Factors¹
State                   Minority²  Low inconsistent³  High inconsistent⁴  Consistent⁵  Target sample size⁶  First-phase sample size⁷
Alabama                 1.78       1.78               1.78                1.00         4,470                26,960
Alaska                  6.20       3.00               3.00                1.00         1,800                28,440
Arizona                 1.78       1.78               1.78                1.00         4,800                55,120
Arkansas                2.00       3.00               3.00                1.00         2,610                25,540
California              2.00       3.00               3.00                1.00         33,510               272,880
Colorado                1.99       2.93               2.93                1.00         4,080                33,680
Connecticut             2.00       3.00               3.00                1.00         3,360                31,710
Delaware                2.91       3.00               3.00                1.00         1,800                35,920
District of Columbia    2.00       3.00               3.00                1.00         1,800                55,310
Florida                 1.00       1.00               1.00                1.00         15,300               57,680
Georgia                 2.01       2.01               2.01                1.00         7,830                63,470
Hawaii                  2.00       3.00               3.00                1.00         3,750                45,410
Idaho                   2.71       3.00               3.00                1.00         1,800                19,250
Illinois                1.19       1.19               1.19                1.00         12,360               31,000
Indiana                 1.68       1.68               1.68                1.00         6,060                15,880
Iowa                    2.00       3.00               3.00                1.00         2,940                16,420
Kansas                  2.00       3.00               3.00                1.00         2,700                17,640
Kentucky                2.00       3.00               3.00                1.00         4,050                29,560
Louisiana               1.89       3.00               3.00                1.00         4,470                36,210
Maine                   6.55       3.00               3.00                1.00         1,800                16,830
Maryland                1.87       2.46               2.46                1.00         5,280                43,320
Massachusetts           2.33       2.33               2.33                1.00         6,300                28,420
Michigan                1.25       1.25               1.25                1.00         10,080               23,200
Minnesota               2.11       2.11               2.11                1.00         4,860                20,340
Mississippi             1.96       2.83               2.83                1.00         2,820                20,260
Missouri                2.25       2.25               2.25                1.00         5,580                20,310
Montana                 1.57       3.00               3.00                1.00         1,800                18,950
Nebraska                2.44       3.00               3.00                1.00         1,800                14,650
Nevada                  1.95       2.76               2.76                1.00         1,800                64,400
New Hampshire           6.84       3.00               3.00                1.00         1,800                21,120
New Jersey              2.24       2.24               2.24                1.00         8,340                38,810
New Mexico              1.73       1.73               1.73                1.00         1,800                35,750
New York                2.00       3.00               3.00                1.00         18,660               142,450
North Carolina          1.83       1.83               1.83                1.00         7,740                27,580
North Dakota            2.14       3.00               3.00                1.00         1,800                15,440
Ohio                    1.22       1.22               1.22                1.00         11,490               31,910
Oklahoma                2.00       3.00               3.00                1.00         3,420                26,630
Oregon                  1.94       2.76               2.76                1.00         3,360                20,680
Pennsylvania            1.70       1.70               1.70                1.00         12,300               35,610
Rhode Island            2.94       3.00               3.00                1.00         1,800                25,610
South Carolina          1.60       1.60               1.60                1.00         3,930                27,340
South Dakota            1.83       3.00               3.00                1.00         1,800                15,500
Tennessee               1.99       2.86               2.86                1.00         5,580                33,290
Texas                   1.86       2.36               2.36                1.00         20,280               183,300
Utah                    2.00       3.00               3.00                1.00         2,160                33,130
Vermont                 6.91       3.00               3.00                1.00         1,800                17,620
Virginia                1.90       1.90               1.90                1.00         6,960                37,560
Washington              2.23       2.23               2.23                1.00         5,850                27,500
West Virginia           2.00       3.00               3.00                1.00         1,860                18,130
Wisconsin               1.75       1.75               1.75                1.00         5,370                14,700
Wyoming                 1.99       3.00               3.00                1.00         1,800                18,000

¹ The observed or actual sampling factors differed from the design sample rates. See the section on Selecting a subsample.
² Clusters with high concentrations of minorities.
³ Clusters where the independent listing housing unit count is at least 25 percent lower than the updated census list count.
⁴ Clusters where the independent listing count is at least 25 percent higher than the updated census list.
⁵ Clusters where the independent listing count and the updated census list do not differ by more than 25 percent.
⁶ Target state housing unit interview sample size, excluding American Indian Reservation sample.
⁷ First-phase preliminary census address list housing unit counts from Spring, 1999.

Sorting the PSUs. The first-phase clusters within each second-phase stratum, by first-phase sampling stratum, were sorted as follows:

* Consistency group
* List/Enumerate indicator
* American Indian Country Indicator
* Demographic/Tenure Group
* 1990 Urbanization
* County code
* Block cluster identification number

Selecting a subsample. Since the first-phase sample utilized different sampling rates for the medium and large sampling strata, separate samples were drawn for each second-phase stratum within the first-phase sampling strata. Selecting the sample required calculating the sampling rates, sorting the clusters, and drawing a systematic sample of clusters.

All of the medium stratum jumpers were retained in the sample. The sampling rates for the remaining four reduction strata were computed so that an integer number of block clusters was selected. This required computing a sampling rate based on the ratio of housing units, which resulted in a non-integer expected number of clusters, determining an integer number of clusters to select, and calculating the final sampling rate based on the ratio of clusters. The medium and large cluster reduction followed the sampling procedure discussed earlier.

This resulted in a total of 9,765 out of 24,136 medium and large clusters retained in the A.C.E. sample for the 50 states and the District of Columbia.

Medium and large cluster reduction sample results. Table 3-8 lists the number of housing units and clusters in sample.

Small Cluster Reduction

The first-phase sample contained 5,000 small clusters in the United States. Small clusters were expected to have between zero and two housing units based on an early census address list. Conducting interviewing and follow-up operations in clusters of this size was not as cost effective as in larger clusters. Therefore, to allocate A.C.E. resources more efficiently, only a subsample of these small clusters was retained in the A.C.E. sample.

This subsampling operation attempted a balance among three goals. One goal was to prevent any small clusters from having sampling weights that were extremely high compared to other clusters in the sample. Second, sampling weights should be lower on clusters where the number of housing units was different than expected. These first two goals attempted to reduce the contribution of small clusters to the variance of the dual system estimates. The third goal was to improve operational efficiency by reducing the number of clusters and future field visits. To achieve these goals, differential sampling was used.

Stratifying first-phase clusters. The first-phase small clusters were classified into nine possible reduction strata within each state. These strata were defined using three cluster characteristics: Size, American Indian Country status, and List/Enumerate status.

The size of a cluster was based on the greater of the independent listing housing unit count or the updated census address list housing unit count as of January 2000. For List/Enumerate clusters the size was always based on the actual independent listing count since the List/Enumerate operation had not yet started by the time of this reduction. The American Indian Country status had three categories as described in the first-phase of sampling. Table 3-9 contains the reduction strata for small block clusters.

Table 3-8. Second-Phase Results: Medium and Large Block Cluster and Housing Unit Counts

Number of...     Minority   Low inconsistent   High inconsistent   Consistent   Stratum jumpers   American Indian reservations   Nation
Housing units¹   230,529    49,086             94,850              403,806      32,064            9,251                          819,586
Clusters         2,553      971                842                 4,801        243               355                            9,765

¹ Independent Listing counts as of December, 1999.

Table 3-9. Small Block Cluster Second-Phase Strata

Second-phase stratum   Housing units   American Indian country   List/Enumerate status
1                      0 to 2          No                        No
2                      3 to 5          No                        No
3                      6 to 9          No                        No
4                      10+             -                         -
5                      0 to 2          No                        Yes
6                      3 to 9          No                        Yes
7                      0 to 9          Reservation/Trust land    -
8                      0 to 2          TJSA/TDSA/ANVSA¹          -
9                      3 to 9          TJSA/TDSA/ANVSA           -

¹ Tribal Jurisdiction Statistical Area/Tribal Designated Statistical Area/Alaska Native Village Statistical Area

Determining target sampling rates. Using independent listing housing unit counts, target sampling rates were determined. These rates attempted to satisfy the previously discussed statistical and operational goals.

Generally, the small clusters were stratified into four groups based on the number of housing units in the cluster. All clusters with ten or more housing units, on American Indian land, or classified as List/Enumerate were retained in sample. For the remaining three reduction strata, some differential sampling was introduced.

To determine the sampling rates for these strata, two conditions were imposed. One of these conditions was that, if possible, the number of weighted housing units in a cluster did not exceed 2,400 housing units. Through computer simulations, a number of different limits were tried until a cap of 2,400 yielded a sample of appropriate size.
The second condition was a minimum sampling rate, which varied among the three strata. Table 3-10 contains a summary of the sampling conditions. Table 3-11 illustrates the process for determining the second-phase sampling rate for each stratum.

The overall target selection probability was based on the maximum number of housing units within a stratum and the previously mentioned cap of 2,400 housing units. For example, the maximum number of housing units in stratum group one was two. Hence, the overall target selection probability was 1 in (2,400/2) or 1 in 1,200. The sampling rate for each second-phase stratum was then set at the rate required to attain these overall target probabilities of selection.

Sorting the PSUs. The first-phase clusters were sorted in the following order in each second-phase stratum:

* 1990 urbanization
* county code
* A.C.E. cluster identification number

Table 3-10. Small Cluster Reduction Sampling Conditions

Second-phase stratum   Cluster size (HUs)   Overall target selection probability   Minimum second-phase sampling rate
1                      0 to 2               1/1,200                                1/10
2                      3 to 5               1/480                                  1/4
3                      6 to 9               1/267                                  1/2.22

Table 3-11. Second-Phase Sampling Rate Criterion

If (Overall target selection probability / First-phase sampling rate) >= Minimum second-phase sampling rate,
then the second-phase sampling rate equals (Overall target selection probability / First-phase sampling rate).

If (Overall target selection probability / First-phase sampling rate) < Minimum second-phase sampling rate,
then the second-phase sampling rate equals the Minimum second-phase sampling rate.

Selecting a subsample. Separate samples were selected from each second-phase stratum within each state and the District of Columbia. This required calculating the actual sampling rate for the stratum, sorting the clusters, and drawing a systematic sample of clusters.

All clusters that had 10 or more housing units, were classified as List/Enumerate, or were in American Indian Country were retained in sample. The sampling rates for the remaining three strata were computed to achieve an integer number of block clusters drawn from each stratum, similar to procedures used for the medium and large cluster reduction. This required computing a sampling rate, which resulted in a noninteger expected number of clusters, determining an integer number of clusters to select, and calculating the final sampling rate based on the ratio of clusters. The small cluster reduction followed the sampling procedure discussed earlier.

This resulted in a total of 1,538 out of 5,000 small clusters retained in the A.C.E. sample for the 50 states and the District of Columbia.

Small cluster reduction results. Table 3-12 gives the distribution of block clusters and housing units after small block cluster reduction. As mentioned earlier, the larger of the independent listing housing unit count and the housing unit count from the updated census address list as of January 2000 was used to stratify the clusters. In Table 3-12, only the independent listing housing unit count is used in these tallies. Hence, with 55 clusters, as seen in the 6-9 cluster size, the number of housing units does not achieve the minimum of 330.

Second-Phase Sampling Results

Table 3-13 lists the block cluster sample sizes and the number of housing units in each state, the District of Columbia, and the nation after the second phase of A.C.E. sampling.

Table 3-12. Second-Phase Results: Small Block Cluster and Housing Unit Counts

Cluster size (HUs)¹   American Indian country   List/enumerate status   Number of housing units²   Number of clusters
0-2                   No                        No                      209                        692
3-5                   No                        No                      358                        117
6-9                   No                        No                      325                        55
10+                   -                         -                       4,532                      112
0-2                   No                        Yes                     59                         290
3-9                   No                        Yes                     76                         16
0-9                   Reservation/Trust land    -                       43                         128
0-2                   TJSA/TDSA/ANVSA³          -                       40                         121
3-9                   TJSA/TDSA/ANVSA           -                       30                         7
Total                                                                   5,672                      1,538

¹ The size of a cluster was based on the higher of the independent listing housing unit count or the January, 2000 census address list. For List/Enumerate clusters the size was always based on the actual independent listing count since the List/Enumerate operation had not yet been started by the time of this reduction.
² Keyed independent listing housing unit counts as of January, 2000.
³ Tribal Jurisdiction Statistical Area/Tribal Designated Statistical Area/Alaska Native Village Statistical Area.

Table 3-13. State Second-Phase Sample Results by First-Phase Stratum

                        Second-phase housing units¹                       Second-phase block clusters
State                   Small    Medium    Large     AIR     Total        Small   Medium   Large   AIR   Total
Alabama                 54       3,599     7,531     0       11,184       14      104      43      0     161
Alaska                  24       1,401     3,099     16      4,540        7       40       22      1     70
Arizona                 140      3,082     17,185    2,826   23,233       69      79       61      113   322
Arkansas                16       2,077     3,566     0       5,659        13      71       24      0     108
California              401      19,124    77,913    204     97,642       93      528      469     11    1,101
Colorado                19       2,722     9,248     52      12,041       24      85       55      2     166
Connecticut             37       1,699     6,718     0       8,454        5       59       47      0     111
Delaware                7        1,572     2,979     0       4,558        3       40       23      0     66
District of Columbia    0        1,251     5,403     0       6,654        2       25       31      0     58
Florida                 265      7,976     54,986    20      63,247       43      259      230     1     533
Georgia                 211      4,095     21,195    0       25,501       27      138      111     0     276
Hawaii                  11       1,200     22,252    0       23,463       6       40       75      0     121
Idaho                   12       1,632     2,714     152     4,510        32      53       16      6     107
Illinois                51       7,527     20,041    0       27,619       25      247      131     0     403
Indiana                 125      4,141     7,431     0       11,697       24      141      46      0     211
Iowa                    161      2,338     3,705     0       6,204        21      79       22      0     122
Kansas                  33       2,193     3,488     31      5,745        24      70       22      1     117
Kentucky                92       2,329     9,621     0       12,042       14      92       52      0     158
Louisiana               7        3,332     6,574     0       9,913        40      109      50      0     199
Maine                   38       1,447     2,020     1       3,506        24      53       16      1     94
Maryland                22       3,288     17,041    0       20,351       6       77       82      0     165
Massachusetts           105      3,467     11,471    0       15,043       10      120      80      0     210
Michigan                64       6,612     13,581    148     20,405       19      227      92      5     343
Minnesota               79       3,210     7,275     286     10,850       28      116      49      10    203
Mississippi             84       2,499     2,957     96      5,636        20      76       25      3     124
Missouri                269      3,229     11,558    0       15,056       24      113      51      0     188
Montana                 15       1,880     2,365     905     5,165        41      60       14      24    139
Nebraska                25       1,685     1,317     91      3,118        31      53       13      3     100
Nevada                  1        1,361     8,506     204     10,072       38      28       30      5     101
New Hampshire           50       1,658     2,535     0       4,243        11      46       19      0     76
New Jersey              4        4,883     14,960    0       19,847       8       147      103     0     258
New Mexico              29       1,813     2,666     1,854   6,362        76      47       19      70    212
New York                582      8,256     62,616    93      71,547       34      271      317     5     627
North Carolina          300      5,149     18,901    136     24,486       28      151      93      4     276
North Dakota            35       1,332     2,076     394     3,837        34      58       17      12    121
Ohio                    146      6,906     22,631    0       29,683       22      230      127     0     379
Oklahoma                96       2,557     5,142     267     8,062        104     89       31      8     232
Oregon                  7        2,165     7,231     124     9,527        52      70       44      3     169
Pennsylvania            203      8,622     15,227    0       24,052       28      293      107     0     428
Rhode Island            6        1,517     2,517     0       4,040        4       47       18      0     69
South Carolina          113      3,540     9,094     0       12,747       15      88       39      0     142
South Dakota            22       1,307     2,613     453     4,395        40      55       14      27    136
Tennessee               381      4,000     10,436    0       14,817       24      125      58      0     207
Texas                   714      13,473    47,011    30      61,228       149     405      238     1     793
Utah                    112      2,583     4,061     134     6,890        29      48       23      7     107
Vermont                 16       1,191     3,237     0       4,444        10      45       20      0     75
Virginia                62       3,443     20,872    0       24,377       15      131      112     0     258
Washington              225      3,320     12,976    438     16,959       33      106      76      17    232
West Virginia           24       1,263     4,666     0       5,953        10      46       23      0     79
Wisconsin               164      4,380     5,909     219     10,672       24      138      39      10    211
Wyoming                 13       1,778     1,186     89      3,066        61      62       11      5     139
Total U.S.              5,672    187,104   642,303   9,263   844,342      1,538   5,880    3,530   355   11,303

¹ Keyed independent listing housing unit counts as of January 2000. Keyed implies these counts went through a quality control review. Consequently, small discrepancies may exist between these independent listing housing unit counts and those from Table 3-8.
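Two of the second-phase calculations described above lend themselves to a short worked sketch: the stratum allocation formula for T_g and the small-cluster sampling rate rule of Table 3-11. The Python below is illustrative only and is not taken from the Census Bureau's production system. The function names, the dictionary layout, and the example inputs (a state target of 1,000 units and first-phase rates of 1/200 and 1/60) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from fractions import Fraction


def allocate_targets(state_target, dsf, housing_units):
    """Apply T_g = T * DSF_g * H_g / sum(DSF * H) over the subsampled strata.

    dsf and housing_units are dicts keyed by reduction stratum name
    (hypothetical layout; the source only gives the formula).
    """
    weighted = {g: dsf[g] * housing_units[g] for g in dsf}
    total = sum(weighted.values())
    return {g: state_target * w / total for g, w in weighted.items()}


def small_cluster_rate(max_units, first_phase_rate, min_rate, cap=2400):
    """Second-phase sampling rate for a small-cluster stratum (Table 3-11).

    The overall target selection probability is max_units / cap, for
    example 2 / 2,400 = 1 in 1,200 for the 0-to-2 stratum.
    """
    overall_target = Fraction(max_units, cap)
    needed = overall_target / first_phase_rate
    # Use the rate needed to reach the target unless it falls below the minimum.
    return needed if needed >= min_rate else min_rate


# Hypothetical state: differential sampling factors 2, 3, 3, 1.
targets = allocate_targets(
    1000,
    {"minority": 2.0, "low": 3.0, "high": 3.0, "consistent": 1.0},
    {"minority": 1000, "low": 500, "high": 500, "consistent": 5000},
)
print(targets)

# Stratum 1 (0 to 2 housing units), minimum second-phase rate 1/10.
print(small_cluster_rate(2, Fraction(1, 200), Fraction(1, 10)))
print(small_cluster_rate(2, Fraction(1, 60), Fraction(1, 10)))
```

Exact fractions are used for the small-cluster rule so that the comparison against the minimum rate is not affected by floating-point rounding. The sketch does not handle a needed rate above 1, which Table 3-11 does not address.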
THIRD PHASE OF THE A.C.E. SAMPLE DESIGN

In very large block clusters, the housing units within the cluster were subsampled. This achieved manageable field workloads for A.C.E. interviewing and person follow-up without having a big impact on reliability. The strategy of the A.C.E. large block cluster sampling plan was to increase the number of clusters in sample, while still attaining the targeted number of housing units for interview. Because housing units in a block cluster are often similar, interviewing all of them is not the most efficient use of resources. Instead, interviewing a manageable fraction of several different clusters provides a more geographically diverse sample.

In the first-phase sampling, large block clusters had a higher selection probability than medium block clusters to take into account this anticipated, subsequent housing unit reduction. The A.C.E. second-phase reduction maintained the differential selection probabilities between the large and medium block clusters. After the reduction of housing units in large block clusters, the housing unit selection probabilities in medium and large block clusters in the same second-phase sampling stratum were similar.

Another important goal of this housing unit reduction was to geographically overlap the P and E samples to reduce the E-sample person follow-up workload. An overlapping P and E sample was not necessary, but improved the precision of dual system estimates, the cost-effectiveness of the succeeding operation, and the data processing efficiency.

Identifying the P-Sample Housing Units

The source of the P-sample housing units, which were subject to person interviewing by the field staff, was the independently listed housing units that were confirmed to exist following the housing unit matching and follow-up operations. (See Chapter 4.) In block clusters that had fewer than 80 of these housing units, all of the housing units were designated to be in the P sample. In addition, all housing units in a block cluster selected from the American Indian Reservation stratum were in the P sample, regardless of how many housing units were in the block cluster. Most block clusters from this stratum were expected to have fewer than 80 housing units and it was desirable to avoid introducing weight variation to the sample cases for this stratum. For block clusters with 80 or more housing units, the housing units were subsampled and the selected housing units were in the P sample.

The reduction of housing units within a large block cluster was done by forming groups of adjacent housing units called segments and selecting one or more segments of housing units to participate in the P sample. The segments had approximately equal numbers of housing units within a block cluster. Segments of housing units were used as the sampling units in order to obtain compact interviewing workloads and to facilitate overlapping P and E samples to reduce E-sample person follow-up workloads.

Flow of operations. A complication of this project was that large block clusters were ready for the housing unit subsampling on a flow basis as the preceding operations, housing unit matching and follow-up, were completed. To remain on schedule, it was essential that the P-sample housing units were selected and prepared for interview as quickly as possible. This meant that sampling parameters were computed based on the housing unit counts from the independent listing. If scheduling had not been an issue, the housing unit counts from the housing unit matching and follow-up would have been used. The time schedule constraints did not permit the entire country to be processed prior to subsampling. Further, there was no pre-specified order in which block clusters were ready for housing unit subsampling. Thus, following the flow of block clusters from the preceding operations, the housing unit subsampling was performed daily.

Stratifying third-phase clusters. Before selecting the sample of segments, block clusters were divided into seven strata within each state. The first five strata were the same strata used for the second phase of sampling for the medium and large first-phase strata. The sixth stratum was the small to large stratum jumpers, block clusters from the small stratum observed to have more than 80 housing units during the independent listing. The seventh stratum was equivalent to the first-phase American Indian Reservation stratum, for which no housing unit reduction was done.

Allocating the sample. Nationally, the target distribution of the 300,000 P-sample housing unit sample was roughly proportional to population size, except for increases in sample size in the smaller states, which had roughly equal sizes. The second-phase introduced differential sampling within each state and generated overall target sample sizes for each reduction stratum in the state, the T_g in the earlier section. Based on these targets and the observed second-phase sample block clusters, the sample was allocated to each stratum to provide approximately equal overall probabilities of selection for housing units from the same stratum.

Determining sampling parameters. Separate sampling parameters were computed for each stratum within a state. For each stratum, the selection probability was the ratio of the target number of housing units from large block clusters over the number of housing units from the independent listing in large block clusters.

$$\text{Within-cluster sampling rate} = \frac{ \text{Target housing unit sample size in large block clusters} }{ \text{Number of listed housing units in large block clusters} }$$

The target housing unit sample size was derived by subtracting the number of housing units in medium block clusters based on the independent list from the target stratum sample size. When tallying the housing unit counts from the independent list, any housing units classified as future construction were omitted from the count. Although some of this future construction was probably going to be built by Census Day, it was expected to be a rare occurrence.

Within a particular stratum in a state, a fixed number of segments was formed in each block cluster. This number was a function of the within-cluster sampling rate. This method yielded different size segments across block clusters within the same stratum. This method is a trade-off between having fewer segments to reduce nonsampling error and having more segments of a fixed size to reduce sample size variation. Nonsampling error was reduced by having fewer segment boundaries to identify.

If the within-cluster sampling rate was less than or equal to 0.5, then

$$\text{Number of segments} = \frac{1}{ \text{within-cluster sampling rate} }$$

rounded up to the nearest integer. When the within-cluster sampling rate was greater than 0.5, the above formula results in only two segments, resulting in increased sample size variation with the larger segment size. To better control sample size variation when the sampling rate was greater than 0.5, the number of segments was calculated as

$$\text{Number of segments} = \frac{1}{ 1 - \text{within-cluster sampling rate} }$$

Forming the segments. Within each block cluster the housing units were sorted by census block and geographic location within the block. Then, based on the number of segments, approximately equal numbers of housing units were assigned to each segment.

Selecting a subsample. Within-cluster subsampling was done daily as the clusters completed the housing unit matching and follow-up operations. Despite the daily processing, the subsampling was equivalent to a one-time sample, since the results of the previous day were carried over to the next and continued. The one difference with the daily operation was the inability to control the block cluster sort across all block clusters in the stratum due to the flow of the block clusters. So, each day the block clusters that were to be subsampled were sorted by block cluster number within each stratum.

A sample of segments was selected by taking one systematic sample across all large block clusters in each stratum within a state. Selecting one systematic sample per sampling stratum, rather than a separate sample from each large cluster, reduced sample size variability. This allowed an observed sample size close to the target housing unit sample size to be achieved.

P-Sample Results

Following within-cluster subsampling, the sample for the 50 states and the District of Columbia was 11,303 block clusters containing about 301,000 housing units.
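The within-cluster sampling rate, the number of segments, and the formation of near-equal segments described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Census Bureau's implementation. The function names and example inputs are hypothetical, the rate is assumed to be strictly between 0 and 1, and rounding up in the branch for rates above 0.5 is an assumption because the source states rounding only for rates of 0.5 or less.

```python
import math
from fractions import Fraction


def within_cluster_rate(stratum_target, medium_units, large_listed_units):
    """Target units in large clusters divided by listed units in large clusters.

    The target for large clusters is the stratum target minus the units
    already accounted for by medium block clusters.
    """
    target_large = stratum_target - medium_units
    return Fraction(target_large, large_listed_units)


def number_of_segments(rate):
    """Segments formed per block cluster for a given rate (a Fraction)."""
    if rate <= Fraction(1, 2):
        return math.ceil(1 / rate)
    # Assumption: the source gives no rounding rule here; round up as well.
    return math.ceil(1 / (1 - rate))


def form_segments(sorted_units, n_segments):
    """Split an already sorted list of units into runs of near-equal size."""
    base, extra = divmod(len(sorted_units), n_segments)
    segments, start = [], 0
    for i in range(n_segments):
        size = base + (1 if i < extra else 0)
        segments.append(sorted_units[start:start + size])
        start += size
    return segments


# Hypothetical stratum: target 1,000 units, 400 in medium clusters,
# 2,400 listed units in large clusters.
rate = within_cluster_rate(1000, 400, 2400)
print(rate, number_of_segments(rate))
print([len(s) for s in form_segments(list(range(10)), 4)])
```

Passing the rate as an exact fraction avoids a floating-point pitfall: with a float rate of 0.8, 1 / (1 - 0.8) evaluates slightly above 5 and would round up to 6 segments.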
Table 3-14 displays the results for each state.

Table 3-14. State Third-Phase Sample Results for the P Sample

                        Housing unit counts¹ by cluster size²      Block cluster counts by cluster size²
State                   0-79      80+       AIR     Total          0-79    80+     AIR   Total
Alabama                 2,947     1,503     0       4,450          115     46      0     161
Alaska                  1,152     587       16      1,739          48      21      1     70
Arizona                 5,193     2,474     2,661   7,667          154     55      113   322
Arkansas                1,795     921       0       2,716          86      22      0     108
California              18,608    14,919    192     33,527         675     415     11    1,101
Colorado                2,662     1,491     50      4,153          113     51      2     166
Connecticut             1,971     1,272     0       3,243          72      39      0     111
Delaware                1,077     693       0       1,770          42      24      0     66
District of Columbia    1,106     1,084     0       2,190          29      29      0     58
Florida                 8,736     6,518     20      15,254         329     203     1     533
Georgia                 4,690     3,072     0       7,762          183     93      0     276
Hawaii                  1,156     2,447     0       3,603          47      74      0     121
Idaho                   1,653     342       146     1,995          86      15      6     107
Illinois                8,510     3,855     0       12,365         292     111     0     403
Indiana                 4,172     1,773     0       5,945          169     42      0     211
Iowa                    2,162     829       0       2,991          101     21      0     122
Kansas                  2,114     552       29      2,666          101     15      1     117
Kentucky                2,607     1,372     0       3,979          111     47      0     158
Louisiana               3,031     1,386     0       4,417          153     46      0     199
Maine                   1,571     361       1       1,932          80      13      1     94
Maryland                2,574     2,713     0       5,287          91      74      0     165
Massachusetts           4,500     1,893     0       6,393          151     59      0     210
Michigan                7,224     2,756     147     9,980          259     79      5     343
Minnesota               3,734     1,420     261     5,154          151     42      10    203
Mississippi             2,332     602       96      2,934          97      24      3     124
Missouri                3,389     2,120     0       5,509          141     47      0     188
Montana                 2,146     654       863     2,800          100     15      24    139
Nebraska                1,736     225       79      1,961          86      11      3     100
Nevada                  1,141     973       189     2,114          70      26      5     101
New Hampshire           1,156     609       0       1,765          53      23      0     76
New Jersey              5,369     2,902     0       8,271          175     83      0     258
New Mexico              2,600     988       1,736   3,588          119     23      70    212
New York                9,301     9,390     88      18,691         332     290     5     627
North Carolina          4,405     3,438     93      7,843          177     95      4     276
North Dakota            1,780     404       381     2,184          95      14      12    121
Ohio                    7,369     3,973     0       11,342         262     117     0     379
Oklahoma                2,696     970       260     3,666          193     31      8     232
Oregon                  1,866     1,606     124     3,472          127     39      3     169
Pennsylvania            9,463     2,801     0       12,264         344     84      0     428
Rhode Island            1,200     574       0       1,774          49      20      0     69
South Carolina          2,505     1,994     0       4,499          103     39      0     142
South Dakota            1,657     520       439     2,177          95      14      27    136
Tennessee               4,071     1,748     0       5,819          156     51      0     207
Texas                   13,031    7,331     29      20,362         588     204     1     793
Utah                    1,640     846       122     2,486          73      27      7     107
Vermont                 1,345     571       0       1,916          57      18      0     75
Virginia                3,765     3,122     0       6,887          156     102     0     258
Washington              4,064     2,043     416     6,107          147     68      17    232
West Virginia           1,108     769       0       1,877          56      23      0     79
Wisconsin               4,323     1,186     209     5,509          164     27      10    211
Wyoming                 1,391     527       83      1,918          121     13      5     139
Total U.S.              184,155   108,028   8,730   300,913        7,774   3,174   355   11,303

¹ The source of the P-sample housing unit counts was the independent list that was confirmed to exist following the housing unit matching and follow-up operations.
² Cluster size was based on number of confirmed A.C.E. housing units after housing unit matching and follow-up.

Identifying the E-Sample Housing Units

The E sample consisted of the census enumerations in the same sample areas as the P sample. The source of the E-sample housing units was the unedited census files. Like the P sample, all housing units in block clusters that had fewer than 80 census housing units or in block clusters selected from the American Indian Reservation stratum were designated to be in the E sample. For block clusters with 80 or more housing units, the housing units were reduced and the selected housing units were in the E sample.

The reduction of housing units within a large block cluster was done by mapping the P-sample segments onto the census housing units. This was possible because when there was a match between an A.C.E. independently listed address and a census address during the housing unit matching, the census identification number was linked to the A.C.E. unit. Then the same segment selected for the P sample was selected for the E sample.

The census inventory of housing units changed between the housing unit matching operation and the identification of the E sample. Therefore, some census housing units did not have a link with an A.C.E. unit. These cases were assigned to a segment using pre-specified rules. Sometimes there were a large number of these cases in the segment selected to be in sample. If there were more than 80 of these, then an additional subsample was drawn from these census housing units without a link to an A.C.E. unit.

The data-defined census person enumerations in the E-sample housing units were in the E sample. To be a census data-defined person, the person record had to have two 100-percent data items filled. Name was not required for the person record to be considered data-defined, but could be one of the two items required to be data-defined.

Census housing units not available for the E sample. Not all housing units on the unedited census file were eligible to be in the E sample. As the census enumerations were being processed, the Census Bureau suspected that there was a significant number of duplicate addresses in the census files. As such, a new census operation, the Housing Unit Duplication Operation, was introduced in the fall of 2000. The primary goal of this operation was to improve the quality of the census; however, its design allowed the A.C.E. operations to proceed. Essentially, suspected duplicate housing units were set aside and analyzed further. These housing units and the corresponding census person enumerations were neither eligible for the E-sample component of the A.C.E. nor available for person matching, and they were excluded from the dual-system estimation calculation. Some of these set-aside housing units and the corresponding census enumerations were later put back into the final census counts.

Subsampling criteria. If a block cluster contained fewer than 80 available census housing units, then all available census housing units were in the E sample. If the block cluster was from the American Indian Reservation stratum, all available housing units were in the E sample. If the block cluster had 80 or more available census housing units, the housing units were subsampled.

Assigning housing units to segments. Within a block cluster, the census housing units were assigned to a segment based on the link to an A.C.E. housing unit address. If there was a link with an A.C.E. unit, then the census housing unit was assigned to the same segment as the A.C.E. unit. This helped to create overlapping P and E samples. Sometimes a census housing unit did not have a link with an A.C.E. housing unit. When this happened, all the available census housing units were sorted and then each census housing unit without a link was assigned to the same segment as the preceding census housing unit.
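The rule for census housing units without an A.C.E. link amounts to carrying the previous unit's segment forward through the sorted list. The sketch below illustrates that reading in Python. The data layout (pairs of unit identifier and linked segment, already in sort order) and the function name are hypothetical. The source does not say what happened when the first unit in the sort had no link, so the sketch leaves such units unassigned.

```python
def assign_segments(sorted_units):
    """Assign each census housing unit to a segment.

    sorted_units is a list of (unit_id, linked_segment) pairs in sort
    order, where linked_segment is None when the census unit has no
    link to an A.C.E. housing unit (hypothetical layout).
    """
    assigned = []
    previous_segment = None
    for unit_id, segment in sorted_units:
        if segment is None:
            # No A.C.E. link: inherit the segment of the preceding unit.
            segment = previous_segment
        assigned.append((unit_id, segment))
        previous_segment = segment
    return assigned


units = [("a", 1), ("b", None), ("c", None), ("d", 2), ("e", None)]
result = assign_segments(units)
print(result)
```

A run of consecutive unlinked units all inherit the segment of the last linked unit before them, which keeps the unlinked cases geographically close to the segment they join.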
Whentheblockclustercontainedcity-styleaddresses,the censushousingunitsweresortedbycensusblocknum-ber,streetname,housenumber,andunitdesignation. | |||
When the block cluster contained non-city-style addresses, the census housing units were sorted by census block number and geographic location within the block. For city-style census addresses, geographic location was not available.

Selecting the E-sample housing units. Once all the census housing units within a block cluster were assigned to a segment, then the census housing units in the segment or segments selected for the P sample were in the E sample. Occasionally, the selected segment or segments within the block cluster contained more than 80 census housing units that did not link to an A.C.E. housing unit. When this occurred, an additional step of subsampling was done to reduce the E sample follow-up workload, since the census housing units without this link were more likely to contribute to the follow-up workload than census housing units with this link.

A systematic subsample of census housing units without a link to an A.C.E. housing unit was drawn. Using the same sort used for assigning housing units to a segment, a subsample of 40 housing units was selected if the resulting subsampling rate was greater than 0.25. However, to avoid excessive sampling weight variation, the minimum subsampling rate was set to 0.25, resulting in more than 40 census housing units without a link to an A.C.E. housing unit being in the E sample from the particular block cluster.

Special case block clusters. There were special case block clusters when none of the census housing units in a block cluster linked to an A.C.E. housing unit address at the time of the housing unit matching. One example of a special case was a List/Enumerate cluster, since the List/Enumerate operation had not been conducted by the time that the housing unit matching was done. None of the housing units in a List/Enumerate cluster could be assigned to a segment. Instead of selecting a compact segment of housing units to be in the E sample, a systematic subsample of the housing units was drawn using the same method as discussed above. This prevented overlapping the P and E samples when these block clusters were large. This did not happen often.

E-Sample Results

Following E-sample identification and subsampling, the E sample for the 50 states and the District of Columbia was 11,303 block clusters containing about 311,000 housing units. Table 3-15 (State Third-Phase Sample Results for the E Sample) displays the results for each state.
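The segment assignment and the extra subsampling step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the Census Bureau's production procedure: the function names are hypothetical, the input is assumed to be already in the sort order given in the text, units that precede the first linked unit are left unassigned because the text does not say how they were handled, and the fractional-interval form of systematic sampling with a random start is one common choice rather than a documented one.

```python
import random
from typing import List, Optional


def assign_segments(linked_segments: List[Optional[str]]) -> List[Optional[str]]:
    """Give each unlinked unit the segment of the preceding unit.

    `linked_segments` holds one entry per census housing unit, in sorted
    order; None marks a unit with no link to an A.C.E. housing unit.
    Leading unlinked units stay None (assumption: rule not given in the text).
    """
    assigned = []
    previous = None
    for segment in linked_segments:
        if segment is not None:
            previous = segment
        assigned.append(previous)
    return assigned


def subsample_unlinked(n_unlinked: int, rng: random.Random) -> List[int]:
    """Return 0-based positions of the unlinked units kept in the E sample."""
    if n_unlinked <= 80:
        # No extra subsampling unless there are more than 80 unlinked units.
        return list(range(n_unlinked))
    # Aim for 40 units, but never let the rate fall below 0.25,
    # i.e. never let the sampling interval exceed 4.
    interval = min(n_unlinked / 40, 4.0)
    start = rng.random() * interval  # random start in [0, interval)
    picks = []
    k = 0
    while start + k * interval < n_unlinked:
        picks.append(int(start + k * interval))
        k += 1
    return picks
```

With 100 unlinked units the rate is 40/100 = 0.40 and 40 units are kept. With 200 unlinked units the rate 40/200 = 0.20 would fall below the floor, so the rate is held at 0.25 and 50 units are kept.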
Table 3-15. State Third-Phase Sample Results for the E Sample

                      Housing unit(1) counts by cluster size(2)   Block cluster counts by cluster size(2)
State                      0-79      80+      AIR    Total     0-79      80+      AIR    Total
Alabama                   2,776    1,793        0    4,569      113       48        0      161
Alaska                      925      926       16    1,867       44       25        1       70
Arizona                   2,819    2,521    2,521    7,861      152       57      113      322
Arkansas                  1,838    1,118        0    2,956       85       23        0      108
California               17,906   16,228      271   34,405      658      432       11    1,101
Colorado                  2,623    1,587       49    4,259      114       50        2      166
Connecticut               2,074    1,241        0    3,315       73       38        0      111
Delaware                  1,270      659        0    1,929       45       21        0       66
District of Columbia      1,144    1,216        0    2,360       29       29        0       58
Florida                   8,108    7,037       26   15,171      320      212        1      533
Georgia                   4,373    3,346        0    7,719      179       97        0      276
Hawaii                    1,323    2,653        0    3,976       49       72        0      121
Idaho                     1,243      850      155    2,248       72       19        6      107
Illinois                  8,190    4,302        0   12,492      288      115        0      403
Indiana                   4,082    1,870        0    5,952      170       41        0      211
Iowa                      2,237      907        0    3,144      102       20        0      122
Kansas                    2,097      734       27    2,858      100       16        1      117
Kentucky                  2,360    1,692        0    4,052      107       51        0      158
Louisiana                 3,078    1,809        0    4,887      152       47        0      199
Maine                     1,595      429        1    2,025       80       13        1       94
Maryland                  2,651    2,786        0    5,437       92       73        0      165
Massachusetts             4,249    2,736        0    6,985      146       64        0      210
Michigan                  6,682    3,311      146   10,139      253       85        5      343
Minnesota                 3,183    1,720      260    5,163      148       45       10      203
Mississippi               2,374      647      114    3,135       99       22        3      124
Missouri                  3,231    2,098        0    5,329      141       47        0      188
Montana                   1,480      521      866    2,867       98       17       24      139
Nebraska                  1,637      258       68    1,963       86       11        3      100
Nevada                      907    1,175      133    2,215       67       29        5      101
New Hampshire             1,426      467        0    1,893       57       19        0       76
New Jersey                4,952    3,666        0    8,618      170       88        0      258
New Mexico                1,136      754    1,536    3,426      121       21       70      212
New York                  9,114   11,071       84   20,269      326      296        5      627
North Carolina            4,510    3,253      101    7,864      182       90        4      276
North Dakota              1,401      482      358    2,241       95       14       12      121
Ohio                      7,223    4,016        0   11,239      263      116        0      379
Oklahoma                  2,366    1,038      265    3,669      193       31        8      232
Oregon                    1,644    2,378      125    4,147      122       44        3      169
Pennsylvania              9,143    3,449        0   12,592      336       92        0      428
Rhode Island              1,194      556        0    1,750       50       19        0       69
South Carolina            2,502    1,968        0    4,470      105       37        0      142
South Dakota              1,278      495      433    2,206       95       14       27      136
Tennessee                 4,022    2,429        0    6,451      157       50        0      207
Texas                    12,412    9,213       27   21,652      574      218        1      793
Utah                      1,434      818      123    2,375       75       25        7      107
Vermont                   1,263      640        0    1,903       56       19        0       75
Virginia                  3,555    3,731        0    7,286      152      106        0      258
Washington                3,371    2,609      411    6,391      144       71       17      232
West Virginia             1,173      724        0    1,897       57       22        0       79
Wisconsin                 4,067    1,159      211    5,437      167       34       10      211
Wyoming                   1,337      554       84    1,975      121       13        5      139
Total U.S.              178,978  123,640    8,411  311,029    7,690    3,258      355   11,303

(1) Available housing unit counts from the unedited census file.
(2) Cluster size was based on available census housing unit tallies.

Third-Phase Sampling Results

Table 3-16 gives the state weighted and unweighted P-sample and E-sample housing units. Also displayed are the average P-sample and E-sample weights, prior to weight trimming, TES adjustment, and nonresponse adjustments. The average weights ranged from approximately 100 to 500.

In Table 3-16, for most of the states, the average E-sample weight is smaller than the average P-sample weight.
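The derived columns of Table 3-16 can be checked with simple arithmetic. The sketch below assumes that each average weight is the weighted housing unit estimate divided by the sample size, and that each P/E column is the ratio of the P-sample figure to the E-sample figure. The text does not state these definitions, but they reproduce the published Alabama row. The function name `summarize_state` is hypothetical.

```python
def summarize_state(weighted_p: int, weighted_e: int, n_p: int, n_e: int) -> dict:
    """Recompute the ratio and average-weight columns for one state."""
    return {
        "weighted_ratio": round(weighted_p / weighted_e, 3),   # weighted P/E
        "sample_size_ratio": round(n_p / n_e, 3),              # unweighted P/E
        "avg_weight_p": round(weighted_p / n_p),                # average P-sample weight
        "avg_weight_e": round(weighted_e / n_e),                # average E-sample weight
    }


# Alabama, using the weighted estimates and sample sizes from Table 3-16.
alabama = summarize_state(1_967_703, 1_953_559, 4_450, 4_569)
print(alabama)
```

For Alabama this gives ratios of 1.007 and 0.974 and average weights of 442 and 428, matching the table.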
Nationally, despite the P- and E-sample sizes differing by about 10,000 housing units, after applying the weight, the weighted number of P-sample housing units is less than one percent larger than the weighted number of E-sample housing units.

Table 3-16. P-Sample and E-Sample Housing Unit Sampling Results

                      Weighted housing unit estimates      Housing unit sample sizes     Average weight
State                      P sample     E sample     P/E  P sample  E sample     P/E  P sample  E sample
Alabama                   1,967,703    1,953,559   1.007     4,450     4,569   0.974       442       428
Alaska                      186,971      187,657   0.996     1,739     1,867   0.931       108       101
Arizona                   2,291,735    2,419,098   0.947     7,667     7,861   0.975       299       308
Arkansas                  1,204,014    1,214,878   0.991     2,716     2,956   0.919       443       411
California               12,255,066   12,129,849   1.010    33,527    34,405   0.974       366       353
Colorado                  1,633,980    1,579,070   1.035     4,153     4,259   0.975       393       371
Connecticut               1,262,197    1,249,792   1.010     3,243     3,315   0.978       389       377
Delaware                    282,962      285,557   0.991     1,770     1,929   0.918       160       148
District of Columbia        295,972      295,099   1.003     2,190     2,360   0.928       135       125
Florida                   7,350,667    6,958,799   1.056    15,254    15,171   1.005       482       459
Georgia                   3,178,003    3,101,337   1.025     7,762     7,719   1.006       409       402
Hawaii                      446,780      467,582   0.956     3,603     3,976   0.906       124       118
Idaho                       475,978      494,377   0.963     1,995     2,248   0.887       239       220
Illinois                  4,752,616    4,723,175   1.006    12,365    12,492   0.990       384       378
Indiana                   2,565,559    2,611,248   0.983     5,945     5,952   0.999       432       439
Iowa                      1,286,159    1,303,393   0.987     2,991     3,144   0.951       430       415
Kansas                    1,054,277    1,085,066   0.972     2,666     2,858   0.933       395       380
Kentucky                  1,738,637    1,688,359   1.030     3,979     4,052   0.982       437       417
Louisiana                 1,690,093    1,767,498   0.956     4,417     4,887   0.904       383       362
Maine                       606,684      580,671   1.045     1,932     2,025   0.954       314       287
Maryland                  2,240,463    2,237,811   1.001     5,287     5,437   0.972       424       412
Massachusetts             2,637,732    2,652,699   0.994     6,393     6,985   0.915       413       380
Michigan                  3,945,568    3,948,348   0.999     9,980    10,139   0.984       395       389
Minnesota                 1,976,410    1,940,302   1.019     5,154     5,163   0.998       383       376
Mississippi               1,067,393    1,065,495   1.002     2,934     3,135   0.936       364       340
Missouri                  2,678,909    2,576,545   1.040     5,509     5,329   1.034       486       483
Montana                     463,607      459,884   1.008     2,800     2,867   0.977       166       160
Nebraska                    684,874      667,586   1.026     1,961     1,963   0.999       349       340
Nevada                      895,050      862,509   1.038     2,114     2,215   0.954       423       389
New Hampshire               558,641      523,562   1.067     1,765     1,893   0.932       317       277
New Jersey                3,377,908    3,338,768   1.012     8,271     8,618   0.960       408       387
New Mexico                  708,714      667,620   1.062     3,588     3,426   1.047       198       195
New York                  7,573,292    7,706,526   0.983    18,691    20,269   0.922       405       380
North Carolina            3,857,166    3,748,539   1.029     7,843     7,864   0.997       492       477
North Dakota                294,040      288,677   1.019     2,184     2,241   0.975       135       129
Ohio                      4,785,461    4,687,680   1.021    11,342    11,239   1.009       422       417
Oklahoma                  1,461,163    1,465,046   0.997     3,666     3,669   0.999       399       399
Oregon                    1,411,681    1,431,030   0.986     3,472     4,147   0.837       407       345
Pennsylvania              5,130,010    5,179,175   0.991    12,264    12,592   0.974       418       411
Rhode Island                408,426      401,022   1.018     1,774     1,750   1.014       230       229
South Carolina            2,274,389    2,332,485   0.975     4,499     4,470   1.006       506       522
South Dakota                300,952      297,492   1.012     2,177     2,206   0.987       138       135
Tennessee                 2,489,607    2,609,919   0.954     5,819     6,451   0.902       428       405
Texas                     8,116,215    8,098,923   1.002    20,362    21,652   0.940       399       374
Utah                        885,164      823,255   1.075     2,486     2,375   1.047       356       347
Vermont                     307,822      296,414   1.038     1,916     1,903   1.007       161       156
Virginia                  2,714,879    2,797,836   0.970     6,887     7,286   0.945       394       384
Washington                2,496,269    2,435,145   1.025     6,107     6,391   0.956       409       381
West Virginia               917,901      916,552   1.001     1,877     1,897   0.989       489       483
Wisconsin                 2,274,773    2,268,976   1.003     5,509     5,437   1.013       413       417
Wyoming                     190,271      194,844   0.977     1,918     1,975   0.971        99        99
United States           115,650,803  115,016,729   1.006   300,913   311,029   0.967       384       370

Attachment. Census 2000 Type of Enumeration Areas (TEAs)(1)

The term TEA has been used for several decennial censuses. For Census 2000, it reflects not only the type of enumeration, but also the method of compiling the census address list that controls the enumeration process. The Census Bureau defines TEA codes at the census collection block level. Each block must have a TEA code, and no block may have more than one TEA code.

TEA 1 - Block Canvassing and Mailout/Mailback

* Contains areas with predominantly city-style (house number/street name) addresses used for mail delivery.
* Census address list is created from USPS, 1990 census, local/tribal, and other potential supplementary address sources.
* Blocks are included in both Block Canvassing and the Postal Validation Check.
* Blocks are included in local/tribal program to identify new construction.

Mailout/mailback is the most efficient, cost-effective enumeration method in heavily populated areas in which mail is delivered to city-style addresses in virtually all cases (there may be scattered non-city-style mailing addresses in use in these areas). In most instances, a census enumerator visits a residence once during Block Canvassing. A subsequent visit is sometimes necessary during Nonresponse Follow-up.

The mailing list used for this operation is derived initially from automated address files (the USPS Delivery Sequence File and the 1990 Census Address Control File), and
updated through various operations, including Address List Review (LUCA 1998), ongoing DSF updates, Block Canvassing, the Postal Validation Check, and the New Construction Program.

TEA 2 - Address Listing and Update/Leave

* Contains areas with some number of non-city-style (e.g., P.O. Box or Rural Route) mailing addresses.
* Census address list is created from Address Listing, and updated from Address List Review (LUCA) 1999, Recanvassing (in selected areas), and Update/Leave
* Blocks are NOT included in Block Canvassing, the Postal Validation Check, or the New Construction Program
* Puerto Rico, including its military bases, is completely in TEA 2

Address Listing and Update/Leave are implemented in areas where mail often is delivered to non-city-style addresses. In these areas, it is difficult to obtain an up-to-date mailing address list and then geocode each address (that is, assign it to a collection block code), because of the constantly changing residential location/mailing address relationship (especially for P.O. Box addresses).
The census address list therefore is compiled through a door-to-door independent listing operation (Address Listing) that is implemented in all TEA 2 blocks.

During Address Listing, enumerators knock on each residence door to obtain the occupant's name, phone number, residential address (or location description), and mailing address. (Enumerators do NOT revisit residences whose occupants are not present. This is why the census address list frequently does NOT contain a mailing address, and why the location description is the ONLY address in the census address list for many residences.) Enumerators identify the location of each building (containing living quarters) they encounter with a uniquely numbered map spot that they enter on their map and record in their address register; this number is linked to all residential units in the building, and stored in both the census address list and the TIGER database. These areas will be included in Address List Review (LUCA) 1999.

At census time, enumerators deliver census questionnaires to all housing units compiled during Address Listing and that remain in TEA 2. In the course of delivering these questionnaires, the enumerators also update the census address list and map-spotted map to reflect housing units that were not previously listed, and to eliminate residences that they cannot locate. (This operation is called Update/Leave, because the enumerators UPDATE the census address list and maps and LEAVE questionnaires.) Update/Leave enumerators use the residential address/location description in conjunction with the map spot location to determine the correct delivery point for all questionnaires.

Most housing units in TEA 2 areas are visited at least twice by enumerators: once during Address Listing, and again during Update/Leave. Respondents must mail their completed census questionnaires to the Census Bureau, and so some residences also will be visited a third time, during Nonresponse Follow-up.
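The rule stated at the start of this attachment, that every collection block has exactly one TEA code, lends itself to a small validation sketch. The code below is illustrative only: the enum member names paraphrase the nine TEA titles used in this attachment, and the function `find_tea_problems` and the block identifiers are hypothetical, not part of any Census Bureau system.

```python
from enum import IntEnum
from typing import Dict, Iterable, List, Set, Tuple


class TEA(IntEnum):
    """The nine Census 2000 TEA codes, named after their titles in this attachment."""
    BLOCK_CANVASSING_MAILOUT_MAILBACK = 1
    ADDRESS_LISTING_UPDATE_LEAVE = 2
    LIST_ENUMERATE = 3
    REMOTE_ALASKA = 4
    RURAL_UPDATE_ENUMERATE = 5
    MILITARY = 6
    URBAN_UPDATE_LEAVE = 7
    URBAN_UPDATE_ENUMERATE = 8
    ADDITIONS_TO_ADDRESS_LISTING = 9


def find_tea_problems(
    blocks: Iterable[str], assignments: Iterable[Tuple[str, int]]
) -> Tuple[List[str], List[str]]:
    """Return (blocks with no TEA code, blocks with more than one TEA code)."""
    codes: Dict[str, Set[TEA]] = {block: set() for block in blocks}
    for block, code in assignments:
        # TEA(code) raises ValueError for anything outside 1 through 9.
        codes.setdefault(block, set()).add(TEA(code))
    missing = sorted(b for b, c in codes.items() if not c)
    conflicting = sorted(b for b, c in codes.items() if len(c) > 1)
    return missing, conflicting


# Hypothetical block identifiers, for illustration only.
missing, conflicting = find_tea_problems(
    ["1001", "1002", "1003"],
    [("1001", 1), ("1002", 2), ("1002", 6)],
)
```

In this example block 1003 has no code and block 1002 carries two, so both would be flagged.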
(1) This documentation is reproduced from the Geography Division, U.S. Census Bureau, Web site located at http://www.geo.census.gov/mob/homep/teas.html.

TEA 3 - List/Enumerate

* Contains areas that are remote, sparsely populated, or not easily accessible
* Census address list is created and enumeration conducted concurrently
* Blocks are not included in Block Canvassing, the Postal Validation Check, the New Construction Program, or Address Listing
* Includes all military bases in TEA 3 areas
* All island areas (except Puerto Rico), including their military bases, are TEA 3

Some areas are remote, sparsely populated, and/or not easily visited. Many of the residences in these areas do not have city-style mail delivery. It is inefficient and expensive to implement Address Listing, Update/Leave, and Nonresponse Follow-up operations involving multiple visits. Instead, the creation of the address list and the delivery/completion of the census questionnaire are accomplished during a single operation, List/Enumerate. Enumerators visit residences in TEA 3 blocks, LIST them for inclusion in the census address list, mark their location on their map with a map spot and number, enter that map spot number in their address register, and ENUMERATE the residents on-site. They collect the same address information as in Address Listing, and include a map spot to reflect each building that contains one or more living quarters. These areas will NOT be included in any Address List Review (LUCA) program, because there is no address list for them in advance of the census.

TEA 4 - Remote Alaska

* Similar to List/Enumerate, but conducted earlier, before ice breakup/snow melt
* These areas will NOT be included in any Address List Review (LUCA) program, because there is no address list for them in advance of the census

TEA 5 - Rural Update/Enumerate

* Contains blocks initially in TEA 2, with map spots for all structures containing at least one housing unit
* In some instances, blocks initially in TEA 3 will be converted to TEA 5. These blocks were not included in Address Listing and LUCA 1999, and therefore lack structures and map spots in the MAF and TIGER at the times that LUCA 1999 and Rural Update/Enumerate are conducted
* Self-enumeration (through Update/Leave) is thought to be unlikely or problematic
* Census address list is updated, and enumeration is conducted, concurrently
* Blocks are NOT included in the Postal Validation Check or the New Construction Program
* The term "rural" reflects Address Listing as the initial source of the census address list, and does NOT reflect the official census definition of the term "rural"
* These areas will be included in Address List Review (LUCA) 1999 materials, as the MAF was compiled initially from Address Listing

In some areas that otherwise meet the criteria for inclusion in TEA 2, the Census Bureau has decided that having respondents enumerate themselves and return their questionnaires via the mail is not the best way to conduct the enumeration. Some targeted populations may be less likely to return their questionnaires in the mail, and more likely to respond to an enumerator. In other areas, housing units may be vacant because they are occupied seasonally. In these and comparable situations, enumerators visit all residences on the census address list and complete the enumeration on-site. In the course of delivering these questionnaires, they also update the census address list to 1) reflect housing units that were not previously listed (including a map spot to reflect each building that contains one or more living quarters), and 2) eliminate housing units that they cannot locate. (This operation is called Rural Update/Enumerate, because the enumerators work in areas that were Address Listed, UPDATE the census address list [and assign map spots as well], and ENUMERATE the residents.)

TEA 6 - Military

* Contains blocks within TEA 2 that are on military bases
* Mailout/Mailback for family housing
* Separate enumeration procedures for barracks, hospitals, etc.
* Blocks are included in both Block Canvassing and the Postal Validation Check
* These blocks are included in Address List Review (LUCA) 1998 materials, as the MAF was compiled initially in the same manner as TEA 1 areas

The Department of Defense has advised the Census Bureau that virtually all family housing (that is, individual residences as opposed to barracks, hospitals, and jails) is assigned city-style addresses to which the Postal Service delivers mail. The Census Bureau therefore implements Mailout/Mailback methods to enumerate the population of these individual residences. Within TEA 1 areas, blocks on military bases are assigned a TEA code of 1. Within TEA 2 areas, blocks on military bases are assigned a TEA code of 6. There is no difference between TEA 1 blocks on military bases and TEA 6 blocks in terms of either compiling the census address list or enumerating the population. Blocks within military bases in List/Enumerate areas (TEA 3) also are TEA 3.

TEA 7 - Urban Update/Leave

* Contains blocks initially in TEA 1
* Census address list is updated, and questionnaires are delivered concurrently, by Census Bureau staff (following procedures employed in TEA 2 areas, but without assigning map spots)
* Blocks ARE included in the Postal Validation Check and the New Construction Program
* The term "urban" reflects the predominance of city-style addresses, and does NOT reflect official census definition of the term "urban"
* These blocks are included in Address List Review (LUCA) 1998 materials, as the MAF was compiled initially in the same manner as TEA 1 areas

In many areas where mail is delivered mostly to city-style addresses, older apartment buildings are common. In many of these buildings, unit designators (that is, apartment numbers) often do not exist. Further, the subdivision of existing units into multiple units, and the conversion of non-residential space to living quarters, may be frequent. Mail, therefore, often is not delivered to individual apartments (or individual mailboxes), but instead left at common drop points.

In some other areas with mostly city-style addresses, many residents have elected to receive their mail at post office boxes. The Census Bureau is concerned that the city-style addresses of these residents may not appear in the census address list.

To ensure questionnaire delivery to the largest number of residences, Update/Leave procedures are employed. As these residences have city-style addresses, there is no need for enumerators to assign map spots to assist enumerators in identifying these residences in subsequent operations.

TEA 8 - Urban Update/Enumerate

* Contains blocks initially in TEA 1, without map spots for any addresses; maps generated for TEA 8 areas will not include map spots
* Contains mostly blocks on those American Indian reservations that initially were included in both TEA 1 and either TEA 2 or 3
* Same enumeration procedures as TEA 5
* The term "urban" reflects the initial inclusion of the block in TEA 1 due to the predominance of city-style mailing addresses
* These areas are included in Block Canvassing and the Postal Validation Check

Most American Indian Reservations will be enumerated using a single enumeration procedure (Mailout/Mailback, Update/Leave, or Update/Enumerate). Some of these initially contained blocks with a mixture of TEA codes. In these instances, the reservations will be enumerated using Update/Enumerate methods (see TEA 5). However, for affected blocks initially in TEA 1, the MAF and TIGER do not include map spots for structures containing at least one housing unit. Instead of converting these blocks to TEA 5 (Rural Update/Enumerate) and determining map spot locations, the blocks are being distinguished by a separate TEA.

TEA 9 - Additions to Address Listing Universe of Blocks

* Contains groups of blocks (assignment areas) initially assigned to TEA 1
* Converted to Address Listing before Block Canvassing is conducted
* Blocks are NOT included in Block Canvassing, the Postal Validation Check, or the New Construction Program

Some blocks that are in TEA 1 contain a significant number of living quarters with non-city-style addresses. These blocks should not be included in Block Canvassing, which is an operation that is designed to confirm and correct the existence and/or location of city-style addresses. The Geography and Field Divisions are identifying Block Canvassing assignment areas (AAs) that likely contain blocks with significant numbers of non-city-style addresses.
Some of these AAs will be removed from Block Canvassing, and included in Address Listing. The blocks in these AAs will be assigned a TEA code of 9, and the census address list compilation and census enumeration activities in TEA 9 blocks will be virtually identical to those in TEA 2 blocks (for instance, they will be included in Update/Leave and Nonresponse Follow-up).

Because most of these blocks had few, if any, addresses in the MAF from the USPS, the entities the blocks are in mostly had nothing to review during Address List Review (LUCA) 1998. For this reason, most of these blocks will have their address list reviewed during a new phase of LUCA, often called LUCA 99 1/2.

Chapter 4. A.C.E. Field and Processing Activities

INTRODUCTION

This chapter describes the operational aspects of the A.C.E., which consisted of four major activities: housing unit listing, housing unit matching, person interviewing, and person matching. Housing unit listing and person interviews were conducted as field activities, whereas housing unit matching and person matching were processing activities carried out in the National Processing Center (NPC) in Jeffersonville, Indiana. As described earlier, all of these activities were completed prior to estimation.

Once the sample clusters were selected, interviewers visited the clusters and independently listed all housing units. The A.C.E. and census housing units were then matched and, for those for which a match was not found, a follow-up interview was conducted to determine the status of the housing unit at the time of the census.

Following the resolution of the housing unit nonmatches, interviews were conducted with residents of the A.C.E.
sample household (P sample) to obtain the roster of household residents and the detail required for matching. The P-sample persons were then matched to the list of persons enumerated in the census in the sample clusters. The search area was expanded to include one ring of surrounding blocks for those clusters identified as containing potential census geocoding errors. This operation was called the targeted extended search (TES) because it targeted clusters with high rates of A.C.E. housing unit nonmatches and census housing unit geocoding error. A further follow-up interview was conducted for selected mismatched people for whom additional information was required. Based on these activities, each person in the sample clusters, whether interviewed in the A.C.E. sample (P sample) or found in the census (E sample), was assigned a final match status code.

It is important to point out some key improvements of the A.C.E. 2000 operations over the 1990 Post-Enumeration Survey (PES). The 2000 A.C.E. improved on 1990 PES in several ways for interviewing and clerical matching.

* One problem in 1990 was the misreporting of Census Day addresses, with an estimated 0.7 percent of the P sample being erroneously reported as nonmovers (West 1991). The Computer Assisted Personal Interview (CAPI) instrument improved the quality of the reporting of mover status because it was a more automated process. In 2000, the Census Day household consisted of nonmovers and outmovers. The nonmovers lived in the housing unit at the time of the interview and on Census Day. The outmovers lived in the housing unit on Census Day, but moved before the A.C.E. interview. Nonmovers and outmovers in the P sample were matched to census people in their block cluster. In 1990, each inmover household (those that moved into PES block clusters after Census Day) had to be matched to a Census Day address, which was usually outside the cluster. In 2000, the reconstructed Census Day household was matched to the census enumerations in the sample block cluster.
* A study of clerical error in the 1990 PES found error in coding matches (Davis 1991) and erroneous enumerations (Davis 1991b). In 1990, codes were entered into a computer system, but the actual matching and duplicate searches were done using paper. In the 2000 A.C.E., the matching was better controlled and more efficient than 1990 because the clerical matching and quality assurance were automated and coded directly into the automated system. The automated interactive system did not prevent all matching error, but reduced the chances for error significantly. Software allowed searching for matches in the census based on first names, last names, characteristics, and addresses. For example, the system allowed searching for all people named George, all people whose last name begins with an H, all people on Elm Street, or everyone in the age 30 to 40 range. The software controlled the match codes that were relevant to the situation. For example, only P-sample nonmatch codes could be assigned to a P-sample nonmatch.
* The electronic searches for duplicates reduced the tedious searching through paper lists of census people. The searching in 1990 was limited to printouts in two sorts: last name and household by address. In 2000, the clerks had the capability to filter on name, characteristics, and address to help identify duplicates. The system monitored whether the matcher had completed all the necessary searches, such as looking for duplicates.
* There were built-in edits to ensure consistency of coding. For example, codes that applied to a household, such as geographic codes, were assigned to all people in the household. The system automatically assigned certain codes, reducing coding error.
* Clerical matchers could use a code indicating the case needed review at the next level of matching. This code allowed them to flag unusual cases to be examined by a person with more experience.
* All quality assurance for the clerical matching was automated.
* Clerical matching was centralized at the NPC instead of having separate groups of matchers in the seven processing offices, as was done in 1990. Forty-six technicians were hired in August 1999 and were thoroughly trained in the design of the A.C.E. and methods of matching people and housing units. These technicians were responsible for quality assurance of the clerical matchers. Additionally, ten analysts who were among the most experienced matchers conducted quality assurance for the technicians and handled the most difficult cases.
* The computer matcher identified matches and possible matches within a block cluster. Additional computer programs were used to check the matching on cases after the before-follow-up clerical matching to identify matches and duplicates in the expanded search area that were not identified by the clerical matchers. Consistency checks were also performed between housing unit and person match codes.
* Keying error of the kind that affected the data capture of the 1990 PES was reduced because the 2000 interview used a CAPI instrument. A more accurate capture of the data increased the efficiency of the computer matching.

HOUSING UNIT LISTING

The first stage of sampling was the selection of A.C.E. block clusters. Then, in September through December of 1999, a listing of the addresses of all the housing units in the A.C.E. sample clusters was conducted. The listing was independent of the census. Training in how to list both city-style and non-city-style areas lasted 3 days and included a review of the first completed cluster assigned to each lister. There were 29,136 sampled block clusters in the 50 states and the District of Columbia. This list of housing units recorded in the Independent Listing Books (ILB) became the frame of A.C.E. housing units from which the P sample was later selected. Besides listing each housing unit in the cluster, the listers inquired about housing units present at each special place and commercial structure.

The housing unit listing was by basic street address. Each basic street address was assigned a map spot number and the map spot number was recorded on the A.C.E. map to identify the location of the basic street address. The address and coverage questions about the structure were
asked for each basic street address. The number of housing units at the basic street address was obtained from a household member at the address, by proxy, from the apartment manager, or by observation. This contact helped to improve the coverage of housing units in the A.C.E. A page in the listing book for single and multiunit structures is shown in Figure 4-1. The individual housing units within a basic street address were listed on the pages of the listing book reserved for multiunits. Also, the A.C.E. lister recorded the number of units within a basic street address on the map in parentheses to conform with census methodology.

Mobile homes that were not in mobile home parks were listed like single units. Each mobile home was assigned a unique map spot number and each mobile home was listed on a separate line in the listing book. If the mobile homes were in a park, the park was listed in the housing unit section of the listing book, and each individual mobile home and vacant site was listed in the mobile home park section of the listing book. Each individual mobile home was assigned a unique map spot number, whether the mobile home was in a park or not. The location of the mobile home was identified by placing the map spot number for the mobile home on the map. This was the same procedure that was used in the census.

The following items were collected and recorded in the listing book for each basic street address:

* City-style addresses (house number and street names)
* Non-city-style addresses (route numbers, route and box numbers, or any other type of address that was not a city-style address)
* Householder names (rural areas only)
* Description of addresses (for only nonhouse number addresses in both urban and rural areas)
* Number of housing units in a basic street address
* Type of basic street address (single unit, multiunit, mobile home not in a mobile home park, mobile home in a mobile home park, housing unit in special place, multiunit in a special place, or other)
* Unit status for single units (occupied or intended for occupancy, under construction, future construction, unfit for habitation, boarded up, storage of household goods, and other)

The following items were also collected and recorded in the listing book for each unit within a multiunit basic street address:

* Unit designation
* Unit status for multiunits (occupied or intended for occupancy, under construction, future construction, unfit for habitation, boarded up, storage of household goods, and other)

The following items were also collected and recorded in the listing book for each mobile home in a mobile home park:

* House number, lot number, or physical description
* Street name
* Rural address
* Unit status (intended for occupancy, unfit for habitation, boarded up, storage of household goods, vacant trailer site in a mobile home park, and other)

After the listing books were received in NPC, they were checked in and the data keyed into a computer file. The keying quality assurance was 100 percent. Keying rejects were reviewed clerically to correct errors before the matching began. A data file of the A.C.E. housing units was created to be used as input to the housing unit
matching.

HOUSING UNIT MATCH

The housing unit matching consisted of four steps: computer matching, clerical matching, housing unit follow-up, and after follow-up coding. The A.C.E. housing units were compared to the census housing units within cluster by computer, and then, clerically. Housing units that did not match, possible matches, and possible duplicates were followed up by field inspection and interview. The results of the follow-up interview were recorded during the after follow-up coding.

The purpose of housing unit matching was to create a list of addresses that existed as housing units in the block cluster on Census Day to use in the P-sample interviewing. The housing unit listing was conducted in the Fall of 1999. Addresses that had a chance to be housing units on Census Day, such as under construction, future construction, and vacant trailer sites, were listed. After the housing unit matching and follow-up, only the housing units originally listed and confirmed to exist as housing units were included in the P sample for CAPI interviewing. Housing units with unresolved status were also included in the interviewing.

Computer matching was conducted after the second phase of sampling, which consisted of sample reduction and small block subsampling. The results of the computer matching were reviewed clerically. All matching was conducted within the sample block clusters. The census addresses were the ones contained in the January, 2000 version of the Decennial Master Address File (DMAF). This was not the final version of the inventory of census addresses, because of later operations. The inventory of census housing units was final after the Hundred Percent Census Unedited File (HCUF) was completed.

As noted earlier, the P and E samples were located in the same block clusters. The advantages of linking the A.C.E.
and census housing units were:

* The link of A.C.E. and census addresses allowed an overlapping P sample and E sample (i.e., the housing units selected for the P sample were mostly the same as those in the E sample), eliminating the error-prone clerical E-sample identification required to achieve the overlapping samples in the 1990 Post Enumeration Survey.
* The linking of addresses also allowed the person interviewing to begin earlier on the telephone using the census telephone number for the census questionnaire returned by mail. The telephone number from the census questionnaire was not available without the link between the A.C.E. housing unit and the census questionnaire for that housing unit.

After sample reduction of clusters, there were 11,303 clusters in the 50 states and the District of Columbia. See Chapter 3 for a discussion on the sample reduction. The 420 clusters in list/enumerate areas were not matched, because their census addresses were not available in the spring of 2000. Therefore, 10,883 clusters were matched in the housing unit phase.

Table 4-1 contains the number of housing units and clusters in housing unit matching for the A.C.E. The census numbers were preliminary; these were the addresses prior to mailing the census questionnaires. Subsequent census operations added and removed addresses from this list. Even though this census list contains more housing units than the A.C.E., this was not indicative of coverage differences due to the preliminary nature of the census numbers. See Chapter 3 for a discussion of the final P-sample and E-sample housing units and how they compare.

Table 4-1. Sample Sizes for the A.C.E. Housing Unit Matching

                                           Clusters   Housing units
Clusters with housing units                  10,157
  A.C.E. housing units                                      838,427
  Census housing units                                      859,296
Clusters without housing units                  726
Total clusters in housing unit matching      10,883

Computer Match

The census housing units included on the DMAF in January 2000, in the block clusters retained in the A.C.E.
after sample reduction and small block subsampling, were used in the housing unit matching. The housing unit data from the independent listing book file and the DMAF extract went through a series of data preparation steps, including address standardization. Addresses from either file that were blank or could not be standardized were matched clerically. The results of the computer matching and images of the A.C.E. and census maps with map spots in rural areas were inputs into automated review and coding software for clerical matching.

Clerical Match

The clerical matchers used the results of the computer matching to aid in their matching of addresses from the A.C.E. and the census. There were 115 clerks, 46 technicians, and 10 analysts involved in the matching operation. The clerks carried out the matching. The technicians applied quality control to the matching performed by the clerks. The analysts carried out quality control on the work of the technicians. The clerks and technicians used a review code when they saw something unusual or something that should have been looked at by the next level of matcher. The technicians and analysts examined the cases coded for review in the previous stage of matching, in addition to cases selected for quality control. The clerks used in the housing unit matching were given 4 weeks of training. The technicians were hired in August 1999 and given extensive training on the background of coverage measurement and the design of the A.C.E., allowing them to make more informed decisions. The analysts were our most experienced people. The analysts have worked on coverage measurement for many years and were quite knowledgeable about the A.C.E. The three levels of staff produced a high quality of matching with a cost-efficient
operation.

The clerical matching was conducted in the housing unit matching phase of the A.C.E. only for clusters expected to benefit from further examination. Since clerical matching was labor intensive, the amount of clerical work performed for the 2000 A.C.E. was reduced by an automated identification of clusters for follow-up interviewing without clerical review. These clusters had only a few nonmatches or nonmatches on only one side. For example, there could be 25 A.C.E. nonmatches and no census nonmatches, so there was nothing the clerical matchers could do. The clerical matchers were thus able to concentrate on the more difficult clusters where the review was beneficial. In 2000, 3,267 clusters were sent to the field for the follow-up phase without clerical review.

Supplemental materials were provided to facilitate the clerical matching, such as the maps with spots to identify the location of A.C.E. and census addresses in rural areas. The A.C.E. and census addresses that could not be matched by the computer were identified for the clerical matching. The matched addresses were not targeted for review, because experience in studies preparatory to the 2000 Census indicated a very high quality of the matches assigned by the computer. However, clerks were allowed to correct any errors in the computer matching that they noticed while they were attempting to match the housing units that were not computer matched.

The clerical matchers used all housing unit information available to match housing units. The urban areas were almost totally city-style addresses. In rural areas, the addresses were more difficult to match, mainly because of the non-city-style addresses. The matchers had householder names and location descriptions to help in matching the A.C.E. and census addresses in rural areas. The spotted maps for the A.C.E. and the census were also used in the final determination of which housing units matched in rural areas. Computer images of the A.C.E. and census spotted maps that were used in the housing unit matching were accessed via the matching software and viewed on the screen.

There was also a clerical search, limited to the block cluster, for duplicate housing units during this phase of the
matching. The possible duplicates were linked in the database for both the A.C.E. and the census. A follow-up interview was conducted to determine if the two addresses referred to the same housing unit.

One goal for the 2000 A.C.E. was not to use any paper in the clerical matching. Almost all materials needed for clerical matching were available on the computer. Paperless matching reduced the time needed for clerical matching, because the time spent waiting for an assignment and associated material was eliminated. There was thus no need for a large staff to maintain an A.C.E. library. Paper maps were available to use for cases where the image of the map was not available or was not easy to view in the software.

The quality assurance was applied as follows: all of the work done by each clerical matcher was reviewed initially until the matcher was determined to be performing at an acceptable level of quality. The number of records to be reviewed before a clerical matcher was classified as acceptable was 200, after which an acceptable clerk had a systematic sample of clusters reviewed for quality assurance. There was a computer record of the level of quality of each clerk's work. If the work in the sample of reviewed clusters fell below the acceptable level of quality, all of the subsequent work of that clerk was reviewed by technicians until the clerk achieved an acceptable level of quality; then sampling was resumed. The analysts performed the same type of quality assurance on the technicians.

Table 4-2 contains the results of before follow-up clerical matching. These numbers include only the housing units in clusters that were processed in the housing unit matching. The list/enumerate clusters are therefore not included. The relisted clusters described at the end of this section are also not included in Table 4-2. The census had more possible duplicates and housing units not matching than the A.C.E. The follow-up interview resolved the housing unit status and determined if the possible duplicates were in fact duplicated.

Table 4-2. Housing Unit Matching Results Before Follow-Up Interviewing

                              A.C.E.                  Census
                       Housing units  Percent  Housing units  Percent
Matched                      681,385     81.6        681,385     79.7
Possible match                29,231      3.5         29,231      3.4
Possible duplicate               735      0.1          5,775      0.7
Not matched                  123,469     14.8        138,657     16.2
Remove from A.C.E.                10      0.0
Total                        834,830    100.0        855,048    100.0

Housing Unit Follow-Up

All of the cases coded as not matched, possibly matched, or possibly duplicated were sent for a follow-up interview, regardless of the type of basic street address code. Selected matched cases were also sent to follow-up to collect additional information. Specifically, the cases identified for field follow-up were:
* A.C.E. addresses with a before follow-up code of not matched. Information was obtained to determine whether the addresses were housing units within the sample cluster.
* Census addresses with a before follow-up code of not matched. Information was obtained to determine whether the addresses were housing units within the sample cluster.
* Possible matches. The possible matches were sent to the field to determine if the A.C.E. and census addresses referred to the same housing unit. If they did not, they were identified as an A.C.E. nonmatch and a census nonmatch during the housing unit follow-up and information was obtained to determine whether the addresses were housing units within the sample cluster.
* Possible census duplicates. Census housing units that were identified as possible duplicates were followed up to determine if the two census addresses referred to the same housing unit.
* Possible A.C.E. duplicates. A.C.E. housing units that were identified as possible duplicates were followed up to determine if the two A.C.E. addresses referred to the same housing unit.
* Matched housing units with a code of under construction, future construction, unfit for habitation, vacant trailer site in a mobile home park, or other. These matches were followed up to determine if they fit the definition of a housing unit at the time of the follow-up interview.

An A.C.E. housing unit with unit status indicating something other than an occupied or vacant housing unit that was intended for occupancy needed a follow-up interview to determine its status at the time of the follow-up interview. The address was either classified as a housing unit or removed from further processing. For example, a unit that was under construction or future construction at the time of listing may have fit the definition of a housing unit at the time of the follow-up interview. If the unit fit the definition of a housing unit, it was included in the A.C.E.
housing unit processing. If construction had not progressed enough for it to fit the definition of a housing unit, it was coded as removed from the A.C.E. housing unit inventory.

The housing unit follow-up forms were computer generated. The questions for housing units requiring a follow-up interview were printed. In addition, all housing units in the block cluster were printed for reference. The questions for the A.C.E. nonmatches are in Figure 4-2. The same questions were asked for the census nonmatches.

The questions on the follow-up form were not designed to be read to respondents, but were intended to be used as a guide for an interviewer. Indeed, many questions were answered by observation. The answer to one question may have been the result of asking several other questions. The follow-up interviewer appropriately modified the questions, when necessary, to the situation that was encountered in the field and recorded the appropriate answers on the follow-up form. This approach was adopted because there were many situations that could occur and a form to cover every possible situation would be cumbersome to handle. It was necessary to find out if the housing unit satisfied the census housing unit definition at the time of the follow-up interview. There was no attempt to gather information about reasons for being something other than a housing unit.

For example, the follow-up interviewer determined if the address for an A.C.E. independent listing nonmatch or a census nonmatch existed as a housing unit. This was not a question meant for a respondent. There were several reasons why an address might not fit the definition of a housing unit, such as it burned, it was a mobile home that moved, it was converted to fewer housing units, it was group quarters, it was used for storage of farm machinery, it was the laundry room in an apartment complex, it was a business, and so forth. The interviewer appropriately modified the questions, as necessary, to the situation that was encountered in the field. Furthermore, the interviewer could identify matches or duplicates in the field that had not been identified in the clerical matching.

Table 4-3. After Follow-Up Housing Unit Matching Results

                                                      A.C.E.                  Census
                                               Housing units  Percent  Housing units  Percent
Matched                                              719,013     86.1        719,013     84.1
Not matched, but existed in the block cluster         76,418      9.2         28,874      3.4
Did not exist as a housing unit                       30,770      3.7         48,684      5.7
Geocoded outside the cluster                           6,316      0.8         45,053      5.3
Duplicate                                              1,157      0.1         12,296      1.4
Unresolved                                             1,156      0.1          1,128      0.1
Total                                                834,830    100.0        855,048    100.0

Additional matches were also identified between the A.C.E. and census addresses during the follow-up interview, when the interviewer realized the two different addresses in the A.C.E. and census referred to the same unit. Corrections and updates to the addresses were also recorded on the follow-up form. The address updates were keyed into the database to accurately identify A.C.E.
housing units for the person interviewing. The follow-up interviewers were instructed not to add housing units missed by both the A.C.E. and census for the 2000 A.C.E.

After Follow-Up Coding

After the field follow-up, the completed forms were returned to the processing office. Using the information obtained during the field work, an after follow-up match code was assigned by the clerical matchers for cases sent to the field. The technicians and analysts reviewed the clusters containing housing units with a review code and carried out quality assurance for the clusters processed in the after follow-up housing unit matching. The follow-up forms were reviewed clerically and codes were assigned to the A.C.E. and census housing units.

Table 4-3 provides housing unit matching results for all A.C.E. and census housing units after the follow-up interview codes were assigned. A.C.E. housing units classified as existing in the block cluster and housing units with unresolved housing unit status were eligible for person interviewing. This included both matched and not matched units. A.C.E. addresses classified as not housing units, duplicates, and geocoding errors were removed from the A.C.E. universe and, therefore, were not eligible for person interviewing. The numbers in Table 4-4 are the A.C.E. housing units that were eligible for person interviewing before sample reduction. These numbers do not include the relisted clusters and clusters in list/enumerate areas. Census housing units with codes of not matched and unresolved statuses were not eligible to be included in the P sample for interviewing because they were not listed in the A.C.E. independent listing.

Table 4-4. A.C.E. Housing Units Eligible for Person Interviewing

                                                A.C.E. housing units  Percent
Matched                                                      719,013     90.3
Not matched, but existed in the block cluster                 76,418      9.6
Unresolved                                                     1,156      0.1
Total                                                        796,587    100.0

Relisting for Clusters with A.C.E. Geocoding Errors

The follow-up operation also examined potential geocoding errors in the original A.C.E. housing unit listings. If a large proportion of the A.C.E. housing units in the cluster had wrong geocodes, the cluster was relisted. Clusters were identified for relisting when the after follow-up coding described in the previous section was completed. The decision to relist was automated. If 80 percent of the housing units in a cluster had geocoding error, the cluster was relisted. There were 62 relisted clusters in the 50 states and the District of Columbia. The field lister for relisted clusters had no previous contact with this cluster. The relisting operation was carried out independently of the list of census housing units. To assure independence, the A.C.E. housing unit listings (both the original listing and the relisting) were done without the A.C.E. lister seeing the census inventory of housing units.

There was no housing unit matching in the relisted clusters during the housing unit matching phase of A.C.E. The addresses listed for A.C.E. during the relisting operation were the addresses used to conduct person interviewing. These clusters were treated in the same way as the list/enumerate clusters in 2000.

An unresolved code was assigned to all of the A.C.E. housing units in the relisted clusters and in the list/enumerate clusters. The census housing units in these clusters were assigned a blank housing unit code.

PERSON INTERVIEW

Prior to person interviewing there was another stage of sampling, the within block subsampling of large block clusters. See Chapter 3 for more details. The resulting housing units from A.C.E. comprised the P-sample housing units assigned for interviewing. There were 11,303 clusters selected for interviewing, and they contained 300,913 P-sample housing units. The person interview training lasted five days.

A.C.E. mover and residence status codes necessary to identify P-sample people from the person interview are
assigned within the interview instrument. These codes are described in Figure 4-3. The goal of the interview was to obtain a household roster for everyone living at the housing unit at the time of the interview and on Census Day, April 1, 2000. Procedure C was used for the 2000 A.C.E. With Procedure C, each A.C.E. person was assigned an A.C.E. mover code, an A.C.E. born since Census Day code, an A.C.E. group quarters code, and an A.C.E. other residence code. The A.C.E. status code combined all of the information from these codes to identify the people for whom matching was necessary. Attachment 1 contains the definitions for codes for the movers, those born since Census Day, members of group quarters, other residence code, and the A.C.E. status code. See the Chapter 7 attachment for more on Procedure C.

Group quarters were not listed in the A.C.E. and A.C.E. interviews were not conducted in group quarters. See Attachment 2 for a discussion of the treatment of group quarters in A.C.E.

Mode of Interview

The A.C.E. person interview was conducted using a CAPI instrument on laptop computers. Attachment 3 contains a description of the procedures followed in the person interview. Some person interviews were conducted by telephone and some by personal visit.

To get an early start for the interviewing, a telephone interview was conducted at households where the census questionnaire included a telephone number and was received at a census processing office early enough for computer processing, before the start of person interviewing. The telephone number came from the census questionnaire of the matching census housing unit. The person interviews conducted by telephone were conducted from April 24, 2000 until June 13, 2000. See Byrne et al. (2001) for more details. A total of 88,573 interviews or 29.4 percent of the total workload were conducted by telephone.

The following cases were excluded from the A.C.E. telephone interviewing:

* Housing units in census large household and census coverage edit follow-up
* Questionnaires that were not returned by mail
* Housing units without house number and street name addresses
* Housing units in small multiunit structures (i.e., less than 20 units)

Large multiunits were able to be included in the telephone interviewing, because they tended to have unique unit designations. Many small multiunit structures and rural areas did not have addresses that allowed the telephone interviewer to distinctly identify the address. Since there was no housing unit matching in relisted and list/enumerate clusters, all person interviewing in relisted and list/enumerate clusters was by personal visit.

All remaining interviews after the end of the telephone operation were conducted in person, except for some nonresponse conversion operation (NRCO) interviews and
interviews in gated communities or secured buildings. The person interviews conducted by personal visit were conducted from June 18 until September 11, 2000. Crew leaders and supervisors conducted telephone interviews to give them experience in interviewing.

Table 4-5 contains the number of interviewers, crew leaders, and supervisors used during production interviewing and during the interviewing for person follow-up after the clerical matching.

Table 4-5. Field Interview Personnel

                Telephone interview  Personal interview  Person follow-up
Interviewers                    450               4,502             4,470
Crew leaders                    794                 836               712
Supervisors                     189                 186               184

For the first 3 weeks of interviewing, the person interview was conducted only with a household member. If an interview with a household member could not be carried out within 3 weeks, an interview with a knowledgeable nonhousehold member was attempted, called a proxy interview. The proxy interviewing was allowed during the remainder of the interviewing period. During the last 2 weeks of interviewing for a cluster, a nonresponse conversion operation was conducted for the noninterviews using the best interviewers. This noninterview conversion attempted to obtain an interview with a household member or a knowledgeable proxy respondent, but not a last resort interview.¹ The nonresponse conversion operation converted 9,518 of the 9,735 total noninterviews to interviews.

¹ Last resort interviews were ones with minimal information, such as names like White Female. The last resort interview is usually not from a knowledgeable proxy respondent. Last resort interviews were conducted in the census at the end of nonresponse follow-up, after all attempts to contact a knowledgeable respondent have not obtained an interview. Last resort interviews were not conducted for A.C.E.

The Questionnaire

There were three paths or sections within the person interview. An interview was conducted using the first two paths, when at least one of the household members, for whom information was required, currently lived at the housing unit when the interview was conducted. One path collected data from a household member, and another path collected data from a nonhousehold member (i.e.,
proxy respondent) for these people. There were two paths, because the questions were worded differently for interviews with household members and with proxy respondents. The interviews from the first two paths were in housing units containing:

* Whole household nonmovers
* Whole household inmovers
* Households with a mixture of nonmovers, inmovers, and outmovers

The third path was for whole household outmovers. The data for outmovers were obtained by proxy with the current resident in the sample household or with other proxy respondents, when necessary. When there was an interview with whole household inmovers, there was also an interview using the third path for whole household outmovers.

When there were multiple interviews for the same housing unit, the CAPI data from the last interview were selected for processing. If there was also a quality assurance interview that replaced the original interview, the quality assurance interview was selected over any other interview.

After the interviewers obtained the names and characteristics of household members, they established the residence status on Census Day. For nonmovers and outmovers, mover status in addition to questions about group quarters and other residences on Census Day established the residence status.

College students living elsewhere in dormitories were not part of the A.C.E. universe. However, they were inadvertently included as inmovers in the A.C.E. instrument. To correct for this, an edit was performed for partial household inmovers who were in group quarters on Census Day. If the inmover was in group quarters on Census Day and was between the ages of 18 and 22, inclusive, the inmover was given an A.C.E. status code of removed.

Quality Assurance of Person Interviewing

The quality assurance plan for the A.C.E. Person Interview operation consisted of a reinterview of a sample of the original A.C.E. interviews. The workload consisted of a preselected random sample of 5 percent of the total person interview caseload and another sample consisting of cases targeted by the supervisors in the regional offices using specially designed targeting reports. The targeting was based on various indicators likely to predict poor data quality or potential fabrication. The targeted sample was another 5 percent of the total workload.

A separate CAPI questionnaire was designed for the quality assurance interviews. The quality assurance questionnaire contained separate paths for telephone and personal visit quality assurance interviews. The questionnaire also included a complete version of the original interview to allow quality assurance interviewers to conduct the household interview on cases suspected of fabrication.
Consequently, it was not necessary to assign another field representative at a later date to conduct the household interviews for such cases.

Quality assurance interviews were conducted either by telephone or personal visit. The interview determined whether or not the original respondent was contacted by an interviewer. If, after an initial set of questions, it appeared that the respondent had not been previously contacted, the quality assurance interview continued with a full household interview that replaced the original interview in all future processing.

The quality assurance plan centered on whether the original interviewer actually contacted the person who was reported to have been interviewed. When this was the case, the interview itself was assumed to be correct because the person interview questionnaire was designed to ensure data quality using data edits and automated questionnaire skip patterns. When this was not the case (i.e., the proper household was not contacted), a full reinterview was conducted.

The quality assurance plan was designed to be most effective for the few interviewers who blatantly include data from fictitious interviews. This occurs in practice in similar surveys. Therefore, discrepant results were targeted by looking for inconsistent or conspicuous results identified using the targeting reports. Examples of inconsistent or conspicuous results include using the same name for respondents across cases, using famous names for household members, or completing cases too late in the day to really have been interviewing at someone's house.

Effectively identifying an interviewer with only one or two errors in a large workload of cases would require a prohibitively large random sample. Because later A.C.E. operations such as the person follow-up interview were expected to identify such cases, the quality assurance plan did not attempt to identify these situations beyond what falls in the 5 percent random sample.

Preliminary Estimation Outcome Codes

Preliminary P-sample estimation outcome codes were assigned to each P-sample housing unit before the computer and clerical matching. This outcome code was assigned to the housing unit based on Census Day for nonmovers and outmovers. Only people with the following A.C.E. status codes were used in the matching operations:

* N = nonmover resident
* O = outmover resident
* U = unresolved residence status

The preliminary estimation outcome codes identified interviews and noninterviews in occupied housing units, vacant housing units, and housing units that were removed from the P sample. The interview outcomes described in this section were Census Day interview outcomes after data editing, which converts whole households of Census Day residents with insufficient information for matching to noninterviews and whole households of Census Day residents who should not have been counted at the housing unit on Census Day to vacant housing units.
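The grouping of outcomes and the data edit described above can be illustrated with a minimal Python sketch. It is not the Census Bureau's processing code: the `Person` record, the `sufficient_info` flag, the function name, the argument names, and the order in which the rules are checked are all assumptions made for illustration. Only the status codes N, O, and U and the four outcome groups come from the text.

```python
from dataclasses import dataclass

# A.C.E. status codes used in the matching operations, as listed in the text.
MATCHING_STATUSES = {"N", "O", "U"}


@dataclass
class Person:
    status: str                   # A.C.E. status code; values other than N, O, U are placeholders
    sufficient_info: bool = True  # hypothetical flag: enough data for matching and follow-up


def preliminary_outcome(is_housing_unit, vacant_per_interviewer, interview_obtained, people):
    """Return 'removed', 'vacant', 'noninterview', or 'interview' for one housing unit.

    The rule order is an assumption; the text does not specify it.
    """
    if not is_housing_unit:
        return "removed"        # not a housing unit on Census Day: out of the P sample
    if vacant_per_interviewer:
        return "vacant"         # identified as vacant on Census Day by the interviewer
    if not interview_obtained:
        return "noninterview"   # field noninterview
    residents = [p for p in people if p.status in MATCHING_STATUSES]
    if not residents:
        return "vacant"         # data edit: whole household of nonresidents
    if not any(p.sufficient_info for p in residents):
        return "noninterview"   # data edit: whole household with insufficient information
    return "interview"
```

For example, under these assumptions a household in which every listed person lacks sufficient information is treated as a noninterview, while a household with at least one usable N, O, or U person remains an interview.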
Interviews

* Complete interviews. Interviews conducted with a household member.
* Proxy interviews. Interviews conducted with someone outside the household.
* Sufficient partial interviews. Interviews with household members or proxies that did not collect all required data, but did collect enough information to be considered as interviews.

Noninterviews

* Field noninterview.
* Whole households of people with insufficient information to permit matching and follow-up.

Vacant on Census Day

* Housing units identified as vacant on Census Day by the interviewer.
* Whole households of people who should have been counted elsewhere on Census Day (i.e., whole household nonresidents).

Not a Housing Unit on Census Day
* The housing units identified during the person interview as not a housing unit on Census Day were removed from the P sample.

Table 4-6 contains the number of each category of preliminary outcome codes and the number and percentages of total occupied and vacant housing units for the preliminary outcome codes grouped into interview, noninterview, and vacant. The percentages of interview and noninterview for occupied housing units were also included. The noninterview rate for occupied housing units was 1.9 percent based on the preliminary outcome codes before clerical matching. The interviewers identified 10,206 addresses or 3.4 percent of the A.C.E. addresses as not being housing units on Census Day. The A.C.E. housing units identified as something other than housing units were not in the P sample. For more details see Childers et al. (2001).

Table 4-6. Preliminary Census Day Estimation Outcome for A.C.E. Housing Units (Unweighted)

                                                         Total housing units   Occupied housing units
Outcome code                                                 Number  Percent       Number  Percent
Interview                                                   257,624     88.6      257,624     98.1
  Complete interview with a household member                235,632
  Complete interview with a proxy respondent                 19,380
  Sufficient partial interview                                2,612
Noninterview                                                  4,988      1.7        4,988      1.9
  Field noninterview                                          2,667
  All people have insufficient information for
    matching and follow-up                                    2,321
Total occupied housing units                                                      262,612    100.0
Vacant                                                       28,095      9.7
  No Census Day residents                                     4,184
  Vacant on Census Day                                       23,911
Total occupied and vacant housing units                     290,707    100.0
Not a housing unit on Census Day                             10,206
Total interviewed housing units                             300,913

The percent noninterview was calculated from the unweighted numbers as the noninterviews divided by the occupied housing units, which were the interviews plus the noninterviews (4,988 / 262,612, or 1.9 percent). Tables of preliminary noninterview rates are presented for respondent type and interview mode in Tables 4-7 and 4-8.

Table 4-7. P-Sample Preliminary Percent Noninterview Before Follow-Up by Respondent Type

Respondent type         P-sample preliminary percent noninterview
Household member                                              0.9
Proxy                                                        13.8
Total                                                         1.9

Of all interviews at occupied housing units, 33.5 percent were completed by telephone, 66.1 percent were completed by personal visit, and 0.3 percent, which was 910 interviews, were completed by a quality assurance replacement interview. The percent noninterview of occupied housing units for each interview mode is shown in Table 4-8.

Table 4-8. P-Sample Preliminary Percent Noninterview Before Follow-Up by Interview Mode

Interview mode                    P-sample preliminary percent noninterview
Telephone                                                               0.9
Personal visit                                                          2.2
Quality assurance replacement                                          36.0
Total                                                                   1.9

While telephone interviews were more likely than personal visit interviews to have insufficient information because neighbors could not be contacted, this was offset by the straightforward nature of the telephone interviews. These were cases where the respondent completed and returned the census form in a timely manner and provided a telephone number on the form. Conversely, personal visit cases tended to be the more difficult situations (such as movers or reluctant respondents), and were therefore much more likely to result in noninterviews.

There were several reasons for a high noninterview rate for the quality assurance replacement interviews. These were difficult interviews, because they failed the quality assurance check and needed a reinterview. Many of the noninterviews were refusals. Additionally, because the instrument was monitoring both the quality assurance case and the replacement interview, it was difficult to obtain the Census Day residents in mover cases so that many of these were noninterviews. There was also a problem with the instrument in cases where the quality assurance interviewer could not find the address on the day of the QA interview. When this occurred, the case failed the quality assurance check, but no data were collected to replace the original interview since the QA interviewer could not find the address. However, unlike in personal visit cases, no attempt was made by the QA interviewer to determine if the sample address also did not exist on Census Day. Therefore, these cases were considered to be Census Day noninterviews. There were 108 such cases.

PERSON MATCHING

After both the CAPI interviewing and the HCUF were completed, the E sample was identified from the HCUF and person matching began. People with incomplete names were identified by computer for both the P and E sample, because they did not contain sufficient information for matching and follow-up. See Attachment 4 for more information about census data-defined and insufficient information for matching and follow-up.

The P-sample people and those in the HCUF, within the sample clusters, were computer matched. The possible matches, P-sample nonmatches, and E-sample nonmatches
were clerically reviewed using an automated matching and review system. Additional matches and possible matches were identified by the clerical staff. Duplicates on both lists were also identified clerically. After the matching was completed, field follow-up was conducted and the results of the field interview were coded in the matching database.

Within Block Cluster Computer and Clerical Matching

With Procedure C, the people in P-sample housing units who were initially matched to the E-sample and non-E-sample census enumerations were:

* nonmovers and outmovers identified as residents (i.e., A.C.E. status equal to N or O), or
* people with unresolved residence status (i.e., A.C.E. status equal to U)

The matching within the sample clusters was done by the computer matcher followed by a computer assisted clerical review. The computer compared the nonmovers and outmovers to the E-sample census enumerations in sample clusters and, when necessary, to the non-E-sample enumerations. These non-E-sample enumerations were census people in housing units that were not included in the E sample after the subsampling of census housing units. The clerical matchers also searched among people enumerated in the census in group quarters. A match was assigned when the name and characteristics in the P sample for a person were found in the census data within the block cluster.

During computer matching, the P sample was matched to the census. However, this matching was prioritized; first the P sample was matched to the E sample, then any leftover nonmatches from the P sample were matched to the non-E-sample people in housing units. The matching occurred in two steps:
* Record Pair Ranking. The standardized names from the P-sample person and the census person were compared along with the person characteristics using a string comparison (Winkler, 1994). A ranking score was assigned to each pair of people and the optimal pairs were identified.
* Determination of Match Cutoffs. The optimal pairs in the cluster were reviewed to determine the cutoffs for matches and nonmatches. All pairs above the match cutoff were identified as a match. All pairs between the match cutoff and nonmatch cutoff were identified as possible matches. All pairs below the nonmatch cutoff were classified as not matched. Match cutoffs were assigned conservatively to prevent false matches.

The goal of the matching and follow-up operation was to produce the correct ratio of cases classified as omitted from the census to those classified as correctly included in the census. After the computer matching, P-sample and E-sample people who did not match were reviewed clerically. The clerical matchers were able to match people the computer could not, because they had the whole household to aid in matching. The P-sample nonmatches were searched for in the census. A duplicate search was also conducted clerically. The matching and duplicate search was aided by the software in sorting and searching the census records. The computer assisted clerical matching software contained all A.C.E. and census information about P-sample and census people, including names, characteristics, outcome of the interview, and address.

The A.C.E. technicians carried out the quality assurance for the clerical matchers and resolved the cases flagged by the clerical matchers as needing further review. The A.C.E. analysts did the quality assurance for the technicians and resolved the cases flagged by the technicians as needing further review. There were 235 clerks, 46 technicians, and 10 analysts to do the clerical matching.

Census Images. Scanned images of census questionnaires were available for matching for the first time in Census 2000. The clerical matchers used these images as an aid in matching and when additional information (like names) was found, the new information was made available for the follow-up interview. An E-sample record could be updated by the clerks to provide sufficient information for matching and follow-up or to correct image capture errors. In addition, some information written outside the capture boxes was used to update the data.

For Census 2000, all census forms were scanned and the subsequent information was interpreted using Optical Mark Recognition and Optical Character Recognition or was keyed. For person matching, images were only available for housing units on the January, 2000 DMAF. Images were not available for census housing units added after January, 2000. An address identified by census identification number (ID) could return more than one form, including the following: original census form, Be Counted form, a foreign language form, and/or a Simplified Enumerator Questionnaire. Be Counted forms were not available to use for viewing images, since they did not have a census ID associated with the form when the data were captured.

The clerical matchers reviewed data for census people with insufficient information for matching and follow-up and searched for additional information that might allow them to be matched when the image was available. All review of census people with insufficient information for matching and follow-up was done before the clerical matching began and the census data in the matching software was updated. The software did not permit the assignment of a code until there were two characteristics and a complete name. After the software data were updated, the clerical matching process began and the matchers could match the P-sample person to the census people now containing sufficient information for matching. The matchers were also able to review data for nonmatches when they suspected data capture errors and to correct the records of name, relationship to person number one, sex, age, Hispanic origin, and race. The corrected data were used on the follow-up form, but not sent to estimation. The updated data were not inserted into the HCUF. This updating was for matching in A.C.E. and for the follow-up form only. The matchers were NOT looking at people who were not data-defined to see if there was more information on the census form to make them data-defined. Therefore, people were NOT created in the census.

Duplicate Search Within Cluster. The search for duplicates was done clerically. A person was duplicated when the data collected for the person was repeated within the block cluster. The printouts used in 1990 for duplicate search were automated in 2000. Search routines in the 2000 clerical matching software made the searches quicker and more accurate. Duplicates were linked in the matching system for later analysis.

Duplicated People Were Identified:
* Within the P sample. A duplicated P-sample person was removed from the final P sample, because both people were not needed in that household in the P sample. When the whole households of P-sample people were duplicated, one of the housing units was converted to a noninterview because the interview was not a good one. The duplicated P-sample household was in a different housing unit and one of them was included instead of the people who actually lived at the address. For example, the Smith family was collected in apartments A and B. Both apartments were housing units. The P-sample interview for the duplicated family is not a good interview and is converted to a noninterview after the P-sample people were removed.
* Within the E sample. An E-sample person duplicate was an erroneous enumeration in the census.
* Between E-sample people and people not in the E sample. The E-sample people were also compared to the census people in housing units within the same sample cluster who were not in sample in large block clusters after the E-sample identification. There was no duplicate search between E-sample people and people enumerated in group quarters. Also, there was no duplicate search within group quarters.

When duplication between an E-sample person and a non-E-sample person was identified, it indicated that there was not a full erroneous enumeration. Therefore, the probability of erroneous enumeration caused by duplication was needed for the duplicated E-sample person. The formula for the probability of erroneous enumeration was 100 times d divided by c + d + 1 percent, or

Pr(EE) = 100d / (c + d + 1) percent

where
c = number of times the E-sample person was duplicated with another E-sample person
d = number of times the E-sample person was duplicated with a non-E-sample person

In 1990, when there was duplication between a person in the E sample and a person in a household that was not in the large-cluster subsample, and therefore not in the E sample, the E-sample person was assigned a probability of erroneous enumeration of one half. This methodology was refined in the 2000 A.C.E. to accommodate triplicates. The 1990 estimate was biased when there was a triplicate enumeration in the census and this triplicate involved two E-sample duplicates and the triplicate was not in the E sample. However, there were only a few of these cases in 2000.

This assumes the E-sample person had been coded as correctly enumerated. If the E-sample person was coded unresolved, the final probability of erroneous enumeration included an imputation for unresolved enumeration status. If the E-sample person was assigned a match code that indicated erroneous enumeration, the number of times that the E-sample person was duplicated with non-E-sample people was irrelevant and ignored. A person could not have a probability of erroneous enumeration that was larger than 100 percent.

Census Geocoding Errors

The clerical matchers reviewed people in census housing units identified in the housing unit matching as geocoding errors. The clerical matchers assigned a code indicating geocoding error to E-sample persons for whole household E-sample nonmatches. There was no need for a follow-up interview, since the housing unit follow-up operation identified these housing units with geocoding errors. These E-sample people were erroneously enumerated in this sample cluster because they were enumerated in a housing unit that was incorrectly geocoded to this sample cluster. In 1990, these people were followed up because it wasn't clear who was incorrectly geocoded until after the follow-up interview.

Coding Nonmatches in Large Households

The mail return short form had a continuation roster to collect names for persons seven through twelve. The mail return long form had a roster for the names of persons one through twelve. Data were collected for the first six people in the household, for both long and short forms. If the large household follow-up was unsuccessful, there were only names for persons seven through twelve for the long and short mail return forms. Census records were not created for the people in households with only names, since they were not data-defined.

The names on the rosters were used to reduce the P-sample follow-up of nonmatches in large households. P-sample people in large households who were found on the large household roster were not followed up because they were residents of the housing unit on Census Day. They were still counted as not matched to a census enumeration, but a follow-up interview was not needed to establish their residence on Census Day.

Targeted Extended Search

P-sample whole household nonmatches with no address match and E-sample whole households of nonmatched people in housing units coded as geocoding errors had their search area expanded into the first ring of surrounding blocks. The expanded search is referred to as targeted extended search (TES). See Chapter 5 for a full discussion.

The targeted extended search for 2000 A.C.E. was a two-stage process. First, clusters were identified that would benefit most from expanding the search area to surrounding blocks. Second, blocks within the surrounding blocks
were targeted for searching.

This extended search was targeted at the clusters most likely to benefit from expanding the search area. The clusters selected for targeted extended search for the 2000 Accuracy and Coverage Evaluation were:

* Clusters included with certainty
* Relisted clusters in A.C.E.
* The 5 percent of clusters having the most unweighted census geocoding errors and A.C.E. address nonmatches
* The 5 percent of clusters having the most weighted census geocoding errors and A.C.E. address nonmatches
* Clusters selected at random from the clusters with A.C.E. housing unit nonmatches (i.e., A.C.E. housing units coded CI or UI) or census housing units identified as geocoding errors (i.e., coded GE)

The clusters not selected for targeted extended search were:

* Clusters not selected from the clusters with A.C.E. housing unit nonmatches (i.e., A.C.E. housing units coded CI or UI) and census housing units identified as geocoding errors (i.e., coded GE), that is, clusters eligible for TES sampling but not selected
* Clusters with no A.C.E. housing unit nonmatches or census geocoding errors identified in the housing unit matching
* List/Enumerate clusters

Table 4-9 contains the number of clusters selected for TES and the number of P-sample and E-sample people in TES. The number of clusters includes the clusters included with certainty because they were relisted. P-sample people with a residence probability of zero have been excluded from the table.

Table 4-9. The TES Sample
Selection  Clusters  P-sample people  E-sample people
Included with certainty ........ 1,150  28,533  20,572
Sampled for TES ........ 1,089  3,889  2,281
Total ........ 2,239  32,422  22,853

Clusters with the most unweighted and weighted census geocoding errors and A.C.E. address nonmatches were included because some clusters with large weights contribute disproportionately to the estimates. Approximately 10 percent of the remaining clusters with A.C.E. housing unit nonmatches and census geocoding errors (49 percent of all clusters) were selected at random. There were 2,239 clusters selected for targeted extended search.

In the second stage of targeting, the work was targeted to blocks within the search area where the geocoding error was located. In 1990, the effort required to search for matches and duplicates in large areas that had only a few possible matches or duplicates appeared to lead to errors. There was anecdotal evidence of clerks who did not bother to look in surrounding blocks because they rarely found anything. Targeting the expanded searching probably reduced clerical errors, as well as the cost of the operation.

P-Sample Matching Extended Search

The search area was expanded to clerically search the ring of surrounding blocks for the P-sample whole household nonmatches, when a housing unit was not a match in housing unit matching (i.e., the housing unit match code was a nonmatch or unresolved). There was no searching in surrounding blocks for partial household nonmatches or for whole household nonmatches with matching addresses.

How the search was done depended on whether the cluster and its surrounding blocks consisted solely of urban type addresses, or whether they consisted of some or all rural type addresses.

* In areas that are completely urban, if the clerk located the basic street address in the surrounding blocks or the clerk determined the range of addresses was in the surrounding blocks, person matching was conducted in that block where the basic street address or range was located. The matching was also conducted when there was a possible address match in a surrounding block.
* In rural or mixed urban and rural areas, because of the difficulties in matching rural type addresses, there was no attempt to match addresses in the surrounding blocks. Instead, people were searched for in all of the surrounding blocks.

E-Sample Extended Search for Geocoding Errors

A census person in a housing unit that was coded as a geocoding error was an erroneous enumeration unless the housing unit was located inside the expanded search area. The census geocoding errors were identified in the housing unit phase of the A.C.E. Another interview identified the housing units that physically existed in the surrounding blocks, instead of within the cluster where they were enumerated. This field work was done for whole household E-sample nonmatches in housing units identified during the housing unit phase as geocoding errors. This field visit was conducted at about the same time as the A.C.E. person interview. The people in these housing units were coded as follows:
* If the housing unit was found to exist in the surrounding blocks, the clerks coded the E-sample person as geocoded to the surrounding blocks during the before follow-up person matching.
* If the housing unit existed in the sample cluster, the E-sample person was coded as not a geocoding error, because that housing unit did exist in the sample cluster.
* If the housing unit did not exist in the surrounding blocks or could not be located on the map sent with the case, the E-sample person was coded as a geocoding error, indicating the person was erroneously enumerated because the housing unit was incorrectly geocoded in the block cluster.
* If the field work was not done or if it could not be determined if the block number entered on the form was in the block cluster or in the surrounding blocks, the unresolved code was used. There was no follow-up for the unresolved cases.

A person follow-up interview for the E-sample nonmatches coded in the sample cluster or in the surrounding blocks was needed to identify other reasons for erroneous enumeration, such as fictitious people and other residences where people should have been counted on Census Day.

E-Sample Targeted Duplicate Search

A search for duplicated people was conducted clerically in the targeted extended search clusters, when the housing unit was identified during the field interview as physically existing in the surrounding blocks. Like the P-sample search for missed units, the duplicate search was created to identify people who were duplicated because of geocoding error. There was no searching for duplicates in the group quarters enumerations.

If an E-sample housing unit was identified as existing in the surrounding blocks, a housing unit duplicate search was conducted. How this was done depended on whether the cluster and its surrounding blocks consisted solely of urban style addresses or whether they were some or all rural style addresses.

* In urban areas, this duplicate search was done first on housing units and then on people. First, the clerks searched in the block where the housing unit should have been counted in the ring of surrounding blocks. If the housing unit was duplicated, a search was conducted to identify duplicated people. The duplicate search was conducted only in the block where the duplicated housing unit was located. These people were duplicated because the housing unit was enumerated correctly in a surrounding block and incorrectly in the sample cluster. If the housing unit was not duplicated, a search for person duplication was not conducted. The search concentrated on people who were duplicated and were in duplicated housing units caused by housing unit geocoding error in the surrounding blocks.
* The duplicate search in rural or mixed areas was a search throughout the entire search area for person duplicates.

Added and Deleted Census Housing Units

Census coverage operations continued past the creation of the January, 2000 DMAF. As a result, an added census housing unit is one that was not in the initial housing unit matching, because it was added to the inventory of census housing units after the January, 2000 DMAF was created. A deleted census housing unit is one that was in the January, 2000 DMAF, but was removed from the cluster before the final inventory of housing units was created.

The targeted extended search was based on the A.C.E. housing unit matching to the January, 2000 DMAF and did not cover census housing units added to the block cluster since housing unit matching, thus excluding any geocoding errors that were not recognized in time to conduct the TES field follow-up. If a cluster was not identified for targeted extended search and a large building was added to the cluster, the first time it could have come to our attention was during person matching and any added housing units would be identified as geocoding errors during the person follow-up. If any of these cases should have been included in the targeted extended search and were incorrectly geocoded, another follow-up operation would have been needed to identify the ones that actually existed in the surrounding blocks and those that existed outside the expanded search area.

There was not sufficient time to conduct another interview to determine which added census housing units with geocoding error really existed in the first ring of surrounding blocks. These cases were handled in two ways:

* In TES clusters and clusters eligible for TES sampling, the people in added housing units where person follow-up identified geocoding error were treated as unresolved and the probability of correct enumeration was imputed. These new unresolved cases were treated the same as any other person coded with unresolved geography.
* When the housing unit was not in a TES cluster, the people remained coded as geocoding errors and were erroneous enumerations.

A similar limitation existed when a housing unit that was matched in the housing unit matching was later deleted.
There was a concern that the deleted unit may have been moved to a surrounding block. Clusters where matched housing units in the DMAF were deleted from the HCUF had no chance of being TES clusters if the cluster had no A.C.E. housing unit nonmatches or census geocoding errors.

These deleted cases were also treated differently depending on whether they were in TES clusters:

* If in a TES cluster, they were identified as TES people and a surrounding block search was conducted for the housing units in the TES P-sample matching.
* If the housing unit was not in a TES cluster, there was no surrounding block matching. Surrounding block matching could not be done because there were no surrounding block people in non-TES clusters.

Before Follow-Up Results

Tables 4-10 and 4-11 contain the results of before follow-up matching for the P sample and the E sample. For details of these codes, see Childers (2001). These before follow-up matching results are from unweighted data from the fifty states and the District of Columbia.

The P-sample codes are grouped into:

* Matched
* Not matched
* Possible match
* Unresolved match status
* Removed from the P sample

Matched. The P-sample person was found in the census.

Not matched. The P-sample person was not found in the census. A follow-up interview was conducted for:

* Partial household nonmatches
* Whole households of conflicting household members (i.e., whole households of P-sample and census nonmatches) (footnote 2)
* Other whole household nonmatches where the P-sample interview was conducted with a nonhousehold member (footnote 3)

Possible match. The P-sample person may have been a match to the census person. A follow-up interview was needed to determine if the two names referred to the same person.

Unresolved match status. The only category of unresolved in the before follow-up matching was insufficient information for matching and follow-up.

Removed from the P sample. The only category of removed from the P sample in the before follow-up matching was the P-sample people coded as duplicates.

The E-sample codes are grouped into:

* Correctly enumerated
* Erroneously enumerated
* Nonmatch
* Possible match
* Unresolved

Correctly enumerated. The correctly enumerated people in before follow-up matching were the ones matching the P sample.

Erroneously enumerated. The categories during before follow-up were fictitious people, duplicates, insufficient information for matching and follow-up, and geocoding errors.

* The fictitious people were those where notes on the census image identified the person as one who died before or was born after Census Day, or as not a real person such as a dog or other pet.
* The E-sample people enumerated more than once were coded as duplicates.
* The E-sample people with insufficient information for matching and follow-up were those who were data-defined, but did not contain full name and at least two characteristics. (footnote 4)
* Census people in housing units identified as geocoding errors (footnote 5) during the initial housing unit follow-up were coded as erroneously enumerated because of geocoding error.

Nonmatch. All E-sample people who did not match to the P sample were sent for a follow-up interview.

Possible match. E-sample people who were coded as possible matches were followed up to determine whether they were, in fact, matches.
Unresolved. In before follow-up matching, the unresolved category only includes the census housing units that needed targeted extended search field work and that field work was not done.

Table 4-10. P Sample Before Follow-Up Matching
P-sample match status  Unweighted people  Percent
Matched ........ 573,506  85.7
Not matched ........ 76,804  11.5
Possible match ........ 5,070  0.8
Unresolved ........ 7,524  1.1
Removed ........ 5,923  0.9
Total ........ 668,827  100.0

Footnote 2: These cases have been called the Smith/Jones cases in the past.
Footnote 3: No follow-up interview was conducted when there were whole households of P-sample nonmatches from interviews with household members in a housing unit that did not match in the housing unit operation or matched to a housing unit containing no data-defined people.
Footnote 4: This is the same rule that was used in the 1990 PES. There must have been enough information about the person to have a chance at locating the person for a follow-up interview before the person was allowed into the matching process. See Childers (2001).
Footnote 5: A geocoding error is an error in assigning the housing unit to the correct location.

Table 4-11. E-sample Before Follow-Up Matching
E-sample enumeration status  Unweighted people  Percent
Correctly enumerated ........ 544,995  76.4
Erroneously enumerated ........ 27,934  3.9
Not matched ........ 134,916  18.9
Possible match ........ 4,751  0.7
Unresolved ........ 304  0.0
Total ........ 712,900  100.0
Note: Percentages in table may not add to total due to rounding.

A.C.E. Person Follow-Up

The person follow-up was conducted to gather additional information to accurately code the residence status of the nonmatched P-sample people and the enumeration status of the E-sample people. In addition, the match status of the possible matches was resolved during the follow-up interview. The following cases were sent to person follow-up:

* P-sample partial household nonmatches
* P-sample whole household nonmatches where the census enumerated different E-sample people (i.e., conflicting households or Smith/Jones cases)
* P-sample whole household nonmatches where the A.C.E. person interview was with a proxy respondent
* E-sample nonmatches
* Possible matches between the P sample and the census
* P-sample matches and nonmatches with unresolved residence status
* P-sample nonmatches needing additional geographic work (footnote 6)

The results of the follow-up interview were recorded in the matching software by the matching clerks. Table 4-12 contains the results of the follow-up coding for the P-sample people who were followed up. The P-sample people who were followed up were clerically classified as:

* Matched
* Nonmatched resident of the cluster on Census Day
* Unresolved residence or match status
* Nonresident of the cluster on Census Day

Matched. The P-sample person was found in the census in the block cluster or in a surrounding block after the follow-up interview.

Nonmatched resident of the cluster on Census Day. The P-sample nonmatch was not found in the census, and the follow-up interview determined he or she should have been counted in the search area for this cluster.

Unresolved residence or match status. The person had unresolved residence status, because the follow-up interview did not successfully collect the information required to accurately identify this person as a resident of the cluster on Census Day. In the case of possible matches, the follow-up interview was not able to ascertain the match status of the people.

Nonresident of the cluster on Census Day. The P-sample person was not a resident of the housing unit on Census Day and was removed from the P sample. These people were duplicates, fictitious, living in a P-sample housing unit that was listed in the cluster in error (i.e., P-sample geocoding error), or the P-sample person should have been counted at another residence on Census Day.

The results of the follow-up interview in Table 4-12 indicate 14.7 percent unresolved and 12.5 percent removed from the P sample.

Table 4-12. Results of P-sample Follow-Up Interview
After follow-up match code  Unweighted people  Percent
Matched ........ 9,793  19.4
Nonmatched resident ........ 26,961  53.4
Unresolved ........ 7,451  14.7
Nonresident ........ 6,296  12.5
Total ........ 50,501  100.0

Table 4-13 contains the results of the E-sample follow-up interviews. The followed-up E-sample people were classified as:

* Matched
* Correctly enumerated
* Erroneously enumerated
* Unresolved

Matched. The P-sample and E-sample enumerations refer to the same person. The match was made after the follow-up interview.

Correctly enumerated. The E-sample nonmatch was identified during the follow-up interview as correctly enumerated in the census.
Footnote 6: Housing units in relist and list/enumerate clusters did not have housing unit matching. Therefore, P-sample geocoding errors in such clusters needed to be identified during person matching. In addition, when the interviewer changed the address in the CAPI instrument, the P-sample geography was checked to make sure the interviewer did not interview outside the sample cluster.

Erroneously enumerated. The E-sample nonmatch was identified during the follow-up interview as erroneously enumerated in the census, because the person should have been counted at another residence on Census Day, was fictitious, had insufficient information for matching and follow-up, was duplicated, or lived in a household that was a geocoding error.

Unresolved. The follow-up interview for the census nonmatch was not successful.

The results of the E-sample follow-up in Table 4-13 indicate 7.4 percent of the E-sample people followed up were erroneously enumerated and 14.1 percent were unresolved.

Table 4-13. Results of E-sample Follow-Up for Nonmatches and Possible Matches
After follow-up match code  Unweighted people  Percent
Matched ........ 9,088  6.3
Correctly enumerated ........ 103,589  72.2
Erroneously enumerated ........ 10,618  7.4
Unresolved ........ 20,185  14.1
Total ........ 143,480  100.0

After Follow-Up Coding

After the follow-up was completed, the results of the interviews were reviewed and codes entered into the system by the matching clerks. See Attachments 5, 6, and 7 for definitions of the individual match, enumeration, and residence status codes assigned by the matching clerks.

The final P-sample results are shown in Tables 4-14 and 4-15. The P-sample people have been classified as matched, not matched, unresolved match status, and removed in Table 4-14 and also tabulated as resident, nonresident, and unresolved residence status in Table 4-15.
The data are unweighted, but the people sampled out of the targeted extended search are removed from tabulations for this section.

The P-sample match status is defined as:

* Matched
* Not matched
* Unresolved match status
* Removed from the P sample

Matched. The P-sample person was found in the cluster or in the surrounding block in either a housing unit or in group quarters.

Not matched. The P-sample person was not found in the search area. If the nonmatch was sent to follow-up, the person was confirmed to be a resident of the cluster on Census Day. If the nonmatch was not sent for a follow-up interview, a household member identified the person as a resident of the housing unit during the original A.C.E. interview.

Unresolved match status. The match status was unresolved for possible matches with unsuccessful follow-up interviews and for P-sample people with insufficient information for matching and follow-up.

Removed from the P sample. People were removed from the P sample when they were fictitious, duplicates, geocoding errors, or not residents of the housing unit on Census Day.

Table 4-14. P-sample Match Status After Follow-Up
P-sample after follow-up match status  Unweighted people  Percent
Matched ........ 578,695  88.6
Not matched ........ 54,424  8.3
Unresolved ........ 7,826  1.2
Removed ........ 12,393  1.9
Total ........ 653,338  100.0

The P-sample residence status was defined as:

* Resident
* Nonresident
* Unresolved residence status

Resident. The P-sample matched or not matched person was a resident of the housing unit on Census Day.
Nonresident. P-sample people were nonresidents of the cluster when they were fictitious, duplicates, geocoding errors, or should not have been included as a resident of the housing unit on Census Day. Nonresidents were removed from the P sample.

Unresolved residence status. A matched or not matched P-sample person had unresolved residence status when the follow-up interview did not successfully determine the person's residence on Census Day. The residence status of the possible match was unresolved when the follow-up interview was not successful. The residence status was also unresolved when the P-sample person had insufficient information for matching.

Table 4-15. P-sample Residence Status After Follow-Up
P-sample after follow-up residence status  Unweighted people  Percent
Resident ........ 625,863  95.8
Nonresident ........ 12,393  1.9
Unresolved ........ 15,082  2.3
Total ........ 653,338  100.0

The final E-sample results are in Table 4-16. The E-sample people were classified as correctly or erroneously enumerated or having an enumeration status of unresolved.
These were the unweighted match results that go to imputation and estimation with the people sampled out of the targeted extended search removed.

The E-sample enumeration status was defined as:

* Correctly enumerated
* Erroneously enumerated
* Unresolved enumeration status

Correctly enumerated. E-sample people were correctly enumerated when they were matched to the P sample, or when they have been followed up and they should have been enumerated in this cluster.

Erroneously enumerated. E-sample people were erroneously enumerated when they have another residence where they should have been counted on Census Day, were fictitious, were duplicated, lived in a housing unit that was a geocoding error, or had insufficient information for matching and follow-up.

Unresolved enumeration status. E-sample people had unresolved enumeration status when the follow-up interview was unsuccessful. The E-sample person may have been followed up to obtain information about the E-sample nonmatch, possible match, matched person with unresolved residence status, or geographic work to obtain the location of the housing unit.

Table 4-16. E-sample Matching After Follow-Up
E-sample enumeration status  Unweighted people  Percent
Correctly enumerated ........ 652,390  92.6
Erroneously enumerated ........ 31,064  4.4
Unresolved ........ 21,148  3.0
Total ........ 704,602  100.0

There were unresolved codes assigned to P-sample and E-sample people. A probability of being matched was imputed for a P-sample person with unresolved match status. A probability that the P-sample person was a resident was imputed when the follow-up did not give enough information to resolve the person's residence status. The probability that a P-sample person was a resident was the probability that the person should have been included in the P-sample. The probability that the E-sample person was correctly enumerated was also imputed for the E-sample people with unresolved enumeration status.

A P-sample person could be matched, but have unresolved residence status or have both match and residence status unresolved. Therefore, tabulations for match status and residence status are shown separately for the P-sample.

Estimation Outcome Codes

Two sets of outcome codes were prepared, one for the Census Day household and one for the Interview Day household. The final P-sample estimation outcome code identified the status of the interview for estimation on Census Day and on the day of the interview. For example, there were cases that were complete interviews for the current residents, but were reported as noninterview or vacant for the Census Day residents.

The final Census Day outcome codes are in Table 4-17. Outcome codes were changed as a result of the follow-up interview in the following types of situations:
* No Census Day residents: noninterview. Whole households of P-sample people who said they lived elsewhere on Census Day were converted to noninterviews.
* No Census Day residents: vacant. Whole households who lived in group quarters on Census Day or should have been enumerated at another residence were converted to vacant.

The outcome codes for these two situations were changed because new information from the follow-up interview indicated the original interview was incorrect. The housing unit outcome code for people identified as residents of the housing unit from the person interview who said in the follow-up interview that they lived elsewhere was changed to noninterview. The original person interview listed this household as residents of the housing unit when they did not live at this address. The interview is incorrect and is converted to a noninterview.

The housing unit outcome codes for people identified as residents of the housing unit, from the person interview who said in the follow-up interview that they lived in group quarters or should have been enumerated at another residence, were changed to vacant. The original person interview should have classified the housing unit as vacant, because the people should have been enumerated at another address.

The table also contains numbers of housing units identified as interviews, noninterviews, and vacant and percentages of total housing units and numbers and percentages of occupied housing units. The noninterview rate for occupied housing units for Census Day was 3.0 percent.
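
As an arithmetic check on the 3.0 percent figure, the following minimal sketch recomputes the rate from the unweighted Census Day interview and noninterview counts reported in Table 4-17 (254,175 and 7,794). The function name `noninterview_rate` is purely illustrative and is not part of any Census Bureau software; the sketch assumes the rate is noninterviews divided by interviews plus noninterviews among occupied housing units.

```python
def noninterview_rate(interviews: int, noninterviews: int) -> float:
    """Percent of occupied housing units that ended as noninterviews."""
    # Occupied housing units are taken as interviews plus noninterviews.
    occupied = interviews + noninterviews
    if occupied == 0:
        raise ValueError("no occupied housing units")
    return 100.0 * noninterviews / occupied


# Unweighted counts as reported in Table 4-17
census_day_interviews = 254_175
census_day_noninterviews = 7_794

rate = noninterview_rate(census_day_interviews, census_day_noninterviews)
print(f"{rate:.1f} percent")  # prints "3.0 percent"
```

The same function could be applied to the counts for any interview mode, provided the interview and noninterview counts for that mode are available.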
Addresses that were not housing units on Census Day were removed from the P sample.

Table 4-17. Final Census Day Estimation Outcome Codes for A.C.E. Housing Units (Unweighted)

                                                                      Total housing units   Occupied housing units
Census Day outcome code                                                 Number   Percent       Number   Percent
Census Day interview                                                   254,175      87.5      254,175      97.0
  Complete Census Day interview with a household member                233,327
  Complete Census Day interview with a proxy respondent                 18,335
  Sufficient partial interview                                           2,513
Census Day noninterview                                                  7,794       2.7        7,794       3.0
  No Census Day residents                                                2,709
  Field Census Day noninterview                                          2,667
  All people have insufficient information for matching and follow-up    2,418
Total occupied Census Day housing units                                                       261,969     100.0
Vacant                                                                  28,472       9.8
  No Census Day residents                                                4,561
  Vacant on Census Day                                                  23,911
Total occupied and vacant housing units on Census Day                  290,441     100.0
Not a housing unit on Census Day                                        10,472
Total housing units                                                    300,913

The Census Day noninterview rates in Tables 4-18 and 4-19 are for occupied housing units. The percent noninterview was calculated for the unweighted numbers of Census Day noninterviews divided by the occupied Census Day interviews, which was the interviews plus the noninterviews on Census Day. The Census Day noninterview rates were recalculated to reflect changes due to coding in after follow-up matching.

Table 4-18. P-sample Noninterview Rates for Census Day in Occupied Housing Units by Interview Mode

Interview mode                 Percent noninterview
Telephone                                       1.1
Personal                                        3.7
Quality assurance                              37.4
Total                                           3.0

Table 4-19. P-sample Noninterview Rates for Census Day in Occupied Housing Units by Type of Interview

Type of interview                      Percent noninterview
Interview with a household member                       1.8
Proxy interview                                        17.4
Total                                                   3.0

Comparison of Initial and Final P-Sample Estimation Outcome Codes for Census Day

Table 4-20 compares the preliminary and final Census Day interview outcome codes. The preliminary Census Day outcome codes were changed, when the follow-up interviews for the P-sample classified people as nonresidents because they did not live at the sample address at the time of the census, or they were considered as living at the sample address but should have been counted at another residence such as group quarters or another home. The housing unit could also be identified as not being a housing unit on Census Day.

Table 4-20. Comparison of the Preliminary and Final Census Day Outcome Codes

Final Census Day outcome codes (columns):
  (1) Interview with household member
  (2) Interview with proxy
  (3) Partial interview
  (4) No Census Day residents - noninterview
  (5) Field noninterview
  (6) Whole household insufficient information
  (7) No Census Day residents - vacant
  (8) Vacant
  (9) Not a housing unit

Preliminary Census Day outcome code           (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)
Interview with household member           233,327       0       0   2,033       0       0     125       0     147
Interview with proxy                            0  18,335       0     676       0       0     252       0     117
Partial interview                               0       0   2,513       0       0      97       0       0       2
Field noninterview                              0       0       0       0   2,667       0       0       0       0
Whole household insufficient information        0       0       0       0       0   2,321       0       0       0
No Census Day residents - vacant                0       0       0       0       0       0   4,184       0       0
Vacant                                          0       0       0       0       0       0       0  23,911       0
Not a housing unit                              0       0       0       0       0       0       0       0  10,206

Table 4-21. Final Interview Day Estimation Outcome Codes for A.C.E. Housing Units (Unweighted)

                                                                      Total housing units   Occupied housing units
Interview Day outcome code                                              Number   Percent       Number   Percent
Interview Day interview                                                264,103      89.0      264,103      98.9
  Complete interview on Interview Day with a household member          249,854
  Complete interview on Interview Day with a proxy respondent           12,317
  Sufficient partial interview                                           1,932
Interview Day noninterview                                               3,052       1.0        3,052       1.1
  No Interview Day residents - household converted to noninterview         483
  Field noninterview on Interview Day                                      373
  All people have insufficient information for matching and follow-up    2,196
Total occupied housing units on Interview Day                                                 267,155     100.0
Vacant on Interview Day                                                 29,662      10.0
Total occupied and vacant housing units on Interview Day               296,817     100.0
Not a housing unit on Interview Day                                      4,096
Total housing units                                                    300,913

Final P-Sample Estimation Outcome Codes for Interview Day

The final Interview Day outcome codes are in Table 4-21. The interview outcome, as of Interview Day, was for cases originally classified as nonmovers and inmovers. Changes as a result of the follow-up interview were from whole households of nonmovers who said they:
* Never lived at this residence
* Lived in group quarters on Census Day
* Lived at another residence on Census Day

The outcome codes for these cases were converted to noninterviews. The Interview Day noninterview rates were recalculated to reflect changes due to coding in after follow-up matching.
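The noninterview rates reported for Census Day and Interview Day can be reproduced from the unweighted counts in Tables 4-17 and 4-21. The following is a minimal Python sketch, not part of the original report: the function name `noninterview_rate` is hypothetical, and the only rule it encodes is the one stated in the text (noninterviews divided by interviews plus noninterviews among occupied housing units).

```python
def noninterview_rate(interviews: int, noninterviews: int) -> float:
    """Percent noninterview among occupied housing units.

    Occupied housing units are taken to be interviews plus noninterviews,
    as described in the text; vacant units and non-housing units are excluded.
    """
    occupied = interviews + noninterviews
    if occupied == 0:
        raise ValueError("no occupied housing units")
    return 100.0 * noninterviews / occupied


# Unweighted counts copied from Table 4-17 (Census Day)
census_day_rate = noninterview_rate(interviews=254_175, noninterviews=7_794)

# Unweighted counts copied from Table 4-21 (Interview Day)
interview_day_rate = noninterview_rate(interviews=264_103, noninterviews=3_052)

print(f"Census Day: {census_day_rate:.1f} percent")        # Census Day: 3.0 percent
print(f"Interview Day: {interview_day_rate:.1f} percent")  # Interview Day: 1.1 percent
```

Rounded to one decimal place, both values agree with the 3.0 percent and 1.1 percent totals reported for occupied housing units.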
The final noninterview rates for Interview Day by interview mode and type of interview are in Tables 4-22 and 4-23.

Table 4-22. P-sample Noninterview Rates for Interview Day in Occupied Housing Units by Interview Mode

Interview mode                 Percent noninterview
Telephone                                       0.7
Personal                                        1.0
Quality assurance                              15.4
Total                                           1.1

Table 4-23. P-sample Noninterview Rates for Interview Day in Occupied Housing Units by Type of Interview

Type of interview                      Percent noninterview
Interview with a household member                       0.5
Proxy interview                                         8.6
Total                                                   1.1

Attachment 1. A.C.E. Mover and Residence Status Code
* A.C.E. Mover Code
  1 = Nonmover
  2 = Inmover
  3 = Outmover
* A.C.E. Born Since Census Day Code
  0 or blank = Default for inmovers
  1 = Born on or before Census Day
  2 = Born since Census Day
  D = Don't know
  R = Refused
* A.C.E. Group Quarters Code
  0 or blank = Default for whole household inmovers (footnote 7)
  1 = In group quarters on Census Day
  2 = Not in group quarters on Census Day
  D = Don't know
  R = Refused
* A.C.E. Other Residence Code
  0 or blank = Default for whole household inmovers
  1 = In other residence on Census Day
  2 = Not in other residence on Census Day
  D = Don't know
  R = Refused
* A.C.E. Status
  N = Nonmover, resident on Census Day
  O = Outmover, resident on Census Day
  I = Inmover, nonresident on Census Day
  R = Removed, nonresident on Census Day
  U = Unresolved residence status
  B = Born since Census Day, nonresident on Census Day

7 Partial household inmovers were assigned the codes of 1, 2, D, or R during the edit for CAPI data review.

Attachment 2. The Treatment of Group Quarters in A.C.E.

The A.C.E. was designed to provide estimates of person coverage in housing units. There was no sample of, and no estimates for, persons in group quarters. The P-sample housing units were selected for the A.C.E. and the people in the P-sample housing units were matched to the people enumerated in census housing units.

Classifying a structure as group quarters was difficult at times. For example, homes for the elderly have made it more common for a single structure to contain apartments for retired people, assisted living, and full care. Another example was college dormitories. A dormitory was group quarters when it was occupied by unmarried students. The dormitory contained housing units if it was occupied by married students. If the dormitory was mixed with married, unmarried, faculty, and staff, it contained housing units. As a result, housing units or group quarters could be misclassified, when they were not easily classified as housing units or group quarters. This misclassification could be found in both the A.C.E. and the census.

When the P-sample people in A.C.E. housing units did not match to people enumerated in housing units in the census, they were matched to people enumerated in the census in group quarters. That is, group quarters were searched for P-sample nonmatches. If the P-sample people were found in the group quarters enumerations, they were treated as matched. However, no attempt was made to discover whether the misclassification was in the A.C.E. or the census.

Likewise, if a census person in the E-sample was enumerated in a housing unit, but the housing unit was misclassified and should have been group quarters, the follow-up of the census nonmatch obtained information about the residence of the person. If it found the person should have been counted in this block in group quarters or a housing unit, the person was coded as correctly enumerated in A.C.E. processing. The ideal was not to classify someone as erroneous when they really should have been counted in this cluster, but the type of residence was misclassified.

If a structure contained both housing units and group quarters, the people who were enumerated in the census in a housing unit were eligible to be in the E sample. The follow-up interview identified such E-sample people who were not matched as living in the cluster and having no other residence. They were coded as correctly enumerated. There was no duplicate search between people enumerated in group quarters and housing units.

In summary, then, the approach was balanced:
* Look for P-sample people in group quarters when they were not found in census housing units.
* Follow up E-sample people in both housing units and group quarters in the cluster.

The population in housing units was covered, but there was no estimate of coverage in group quarters. If the housing unit was duplicated in the group quarters, the group quarters people were not counted as duplicates.
Likewise, if a group quarter was missed, there was no determination of undercounted inhabitants.

Attachment 3. The A.C.E. Person Interview (see footnote 8)

Household Roster

If the person lived at the sample address, the interviewer began the interview with a series of questions to obtain the names of everyone currently living at the sample housing unit. The first question was: "I need to get a list of everyone living here permanently or staying temporarily at this address. What is your name?"

After obtaining the name of the person with whom the interviewer was speaking, the interviewer asked, "Anyone else?" If there was a "yes" response, the interviewer asked, "What is his or her name?" and followed that with "Anyone else?" until the interviewer received a "no" response.

As a check for types of people who were frequently left off listings of household members, there were two additional questions. The first question asked about people who may have lived at the household sometimes, but not all the time, such as children in joint custody or people who traveled a great deal of the time. The question was: "Are there any additional people who currently live or stay here, like someone who's temporarily away or someone who stays here off and on?"

If the response was "yes," the interviewer asked, "What was his or her name?" and followed with "Anyone else?" until the interviewer received a "no" response.

Other persons who were frequently omitted from household listings were roommates or live-in employees. The interviewer asked, "Is there anyone else like a roommate or a live-in employee who lives here?" If the response was "yes," the interviewer asked, "What is his or her name?" and followed with "Anyone else?" until the interviewer received a "no" response.

At this point in the interview, the interviewer had collected a list of household members that the respondent had voluntarily mentioned, and the interviewer had also checked for two types of persons that research had shown were frequently left off household listings. The interviewer then reviewed a screen that contained a list of the household members the respondent reported. The interviewer read the list of names and asked if the list was correct. The interviewer said, "I have listed [READS NAMES ON SCREEN]. Is that correct?" After the respondent had reviewed the names, the interviewer could change the spelling, or add or delete a name.
Movers

When the respondent agreed that the list was correct, the interviewer handed the respondent a calendar containing the months of March, April, and May of 2000 that had Census Day clearly marked. At this point in the interview, the goal was to begin determining whether the people listed as current residents were also residents of the sample housing unit on Census Day and if anyone else should have been included as a Census Day resident. The interviewer asked if any of the listed persons (current residents) had moved into the sample housing unit after Census Day. The interviewer said to the respondent: "Please look at this calendar. Did any of the people I just listed move into <sample address> after Census Day, April 1, 2000?"

If the answer was "yes," the interviewer asked, "Who moved in after April 1?" Any person mentioned was considered a nonresident of the sample housing unit on Census Day. If everyone in the household was mentioned, then the whole household was considered nonresidents on Census Day.

The interviewer now had a list of current residents who also lived at the sample housing unit on Census Day. It was necessary to determine if there was anyone living at the sample housing unit on Census Day who did not live there currently. The interviewer asked, "Was there anyone else living or staying here on April 1, 2000 who has moved out?" If the response was "yes," the interviewer asked, "What is his or her name?" and "Anyone else?" until a "no" response was received.

The interviewer now had a list of the names of everyone the respondent had reported living at the sample housing unit currently and on Census Day. The interviewer then established a reference person (relationships will be relative to this person) by asking who owns or rents the house or apartment. The interviewer asked, "In whose name is this (house/apartment) owned or rented?"

The interviewer also asked whether the housing unit was owned or rented by saying, "Do you own this (house/apartment), rent it, or live here without payment of rent?"

8 See Keeley (2000) for details.

Demographics

At this point in the interview, the interviewer began to collect demographic characteristics about all listed persons to facilitate matching the persons collected in this interview to persons listed on the census questionnaire for the sample housing unit. Demographic characteristics are also used to create post-strata in dual system estimation. See Chapter 7 for more details.

The demographic characteristics collected in the interview were:

1. Sex. The interviewer may have entered the sex of the person or asked the question when in doubt. The question was, "Is [NAME] male or female?" (see footnote 9)
2. Age. Age was collected in a series of questions. The interviewer asked for date of birth ("What is [NAME'S] date of birth?"). When the date of birth was entered in the instrument, the age of the person was calculated and the interviewer verified the age by saying, "So [NAME] was about [AGE] on April 1?" If the age was not correct, the interviewer changed the date of birth in the previous question and the age was then recalculated. If the respondent did not know the date of birth, then the interviewer asked the person's age. The interviewer asked, "What was [NAME'S] age on April 1, 2000?"
3. Relationship. Relationship was to the person in whose name the house or apartment was owned or rented (called the Reference Person). The interviewer handed the respondent a card containing relationship categories and asked, "How is [NAME] related to [THE REFERENCE PERSON]?" for each person.
4. Hispanic Origin. Hispanic origin was collected in a series of questions. The first question was, "Is anyone of Spanish, Hispanic, or Latino origin?" If the response was "yes," the interviewer asked, "Who is?" followed by "Is there anyone else of Spanish, Hispanic, or Latino origin?" until the response was "no." If anyone was mentioned as being of Hispanic origin, the interviewer asked, "Is [NAME] of Mexican, Puerto Rican, Cuban, or some other Spanish origin?" for each person mentioned.
5. Race. Race was also collected in a series of questions. The interviewer referred the respondent to the part of the card containing racial categories and said, "I'm going to read a list of race categories. Please choose one or more categories that best describe [NAME'S] race." If the respondent said, "American Indian or Alaska Native," the interviewer asked, "What is [NAME'S] enrolled or principal tribe(s)?" The interviewer recorded as many responses as given. If the respondent said, "Asian," the interviewer asked, "To what Asian group did [NAME] belong? Is [NAME] Asian Indian, Chinese, Filipino, Japanese, Korean, Vietnamese, or, some other Asian group?" The interviewer recorded as many responses as given. If the respondent said, "Pacific Islander," the interviewer said, "To what Pacific Islander group did [NAME] belong? Is [NAME] Guamanian or Chamorro, Samoan, or some other Pacific Islander group?" The interviewer recorded as many responses as given.

At this point, the interviewer had a list of all reported current and Census Day residents and their demographic characteristics for use in matching these residents to residents reported on the census questionnaire for this housing unit.

For households that reported moving into the sample housing unit after Census Day, this information was verified. The interviewer said to the respondent: "So, everyone you mentioned today moved into <sample address> after April 1, 2000. Is that correct?"

If the information was correct, the interview was continued by asking the respondent if he or she knew and had information about the residents of the sample housing unit who lived there on Census Day. (This part of the interview was discussed in the section on movers.)

Residence Section

For all households in which at least one member lived at the sample housing unit on Census Day, the interviewer continued with a few questions that checked for two types of special living situations that were potential sources of duplicate enumerations. Respondents tended to forget that household members may have been living or staying at a place away from the sample housing unit. This may have caused some persons to be reported more than once, at the sample housing unit and again at other places where they may have lived or stayed.

The first situation that had the potential to cause duplicate enumerations was when a person may have lived at a place that was not a private household on Census Day.
Since the Census Bureau did special enumerations at places such as college dorms, nursing homes, prisons, and emergency shelters, the interviewer inquired if anyone was staying at any of these types of places by saying: "Your answers to the next few questions help us count everyone at the right place. The Census Bureau did a special count at all places where groups of people stay. Examples include college dorms, nursing homes, prisons, and emergency shelters. On April 1, 2000, were any of the people you mentioned today staying elsewhere at any of these types of places?" (see footnote 10)

If the response was "yes," the interviewer asked, "Who stayed at one of these types of places?"

The next situation that could result in a duplicate enumeration was when a person might have had another residence. The interviewer said: "Some people have more than one place to live. Examples include a second residence for work, a friend's or relative's home, or a vacation home. On April 1, 2000, did any of the people you mentioned today have a residence other than <sample address>?"

If the response was "yes," the interviewer asked, "Who had another residence?"

For each person mentioned as having another residence, the interviewer asked, "As of April 1, did [NAME] spend most of the time at <sample address> or at the other residence?" If the response was, "I don't know," the interviewer asked: "Which of the following categories, most accurately describes the amount of time [NAME] stays at the other residence? A few days of each week; entire weeks of each month; months at a time; or some other period of time."

If the respondent still was not sure where the person spent most of the time, there was a series of questions designed to assign an amount of time spent at some other residence, such as, "During a typical week, did [NAME] spend more days at <sample address> or at the other residence?" or "During a typical month, did [NAME] spend more weeks at <sample address> or at the other residence?"

If these questions did not help the respondent decide where the person spent most of the time, the person's residence was determined by asking: "Was [NAME] staying at <sample address> or the other residence on April 1, 2000?"

At this point, the interviewer had a reported list of current and Census Day residents of the sample housing unit developed through an extensive household listing procedure. The interviewer had obtained the demographic characteristics of the listed persons. Through questions on mobility and other possible residences it had been determined:
* whether everyone listed in the household currently should be considered a Census Day resident of the sample housing unit
* whether anyone currently absent from the household should be considered a Census Day resident.

Conclusion of Interview

The interviewer now was ready to conclude the interview. Before concluding, there was one last check of the household listing. The first name, middle initial, last name, sex, and age of each person listed as a current and Census Day resident was shown on the screen. The interviewer, again, showed the respondent the computer screen and asked, "Do I have the spelling, sex, and age correct for everyone?" If not, corrections could be made at this screen and the respondent was asked to verify and/or change the information until the respondent said that everything was correct.

The interviewer asked the respondent for his/her telephone number by saying, "In case we need to contact you again, may I please have your telephone number?" then thanked the respondent and concluded the interview by saying, "This concludes our interview. The Census Bureau thanks you for your participation."

9 The brackets containing name, age, and the Reference Person's name were filled by the instrument. When speaking to the respondent, "Are you" or other appropriate fillers replaced "Is [NAME]."
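The age check in the Demographics section, where the instrument calculated an age from the date of birth and the interviewer confirmed it as of April 1, can be illustrated with a short Python sketch. This is an assumption about how such a calculation might work (completed years of age on Census Day); the report does not describe the instrument's actual rule, and the names `CENSUS_DAY` and `age_on_census_day` are hypothetical.

```python
from datetime import date

# Census Day for Census 2000, as stated in the interview questions.
CENSUS_DAY = date(2000, 4, 1)


def age_on_census_day(birth: date, reference: date = CENSUS_DAY) -> int:
    """Completed years of age on the reference date (assumed rule).

    Raises ValueError for a birth date after the reference date, since such
    a person would be classified as born since Census Day rather than aged.
    """
    if birth > reference:
        raise ValueError("born after the reference date")
    # Has the birthday already occurred (or does it fall) on the reference date?
    had_birthday = (reference.month, reference.day) >= (birth.month, birth.day)
    return reference.year - birth.year - (0 if had_birthday else 1)


print(age_on_census_day(date(1970, 4, 1)))  # 30
print(age_on_census_day(date(1970, 4, 2)))  # 29
```

A birth date one day after April 1 gives an age one year lower, which is the kind of boundary case the interviewer's verification question ("So [NAME] was about [AGE] on April 1?") would help catch.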
10 An interviewer help screen was available with a complete list of special enumeration places.

Attachment 4. Insufficient Information for Matching and Follow-Up

The census person records were reviewed both by computer and clerically to identify people with insufficient information for matching and follow-up. Only people with sufficient information for matching and follow-up were allowed to be processed in the matching and follow-up interviewing phases of the person matching. The three types of insufficient information were:
* The census people were not data-defined.
* The census people were data-defined, but computer coded as insufficient information for matching and follow-up.
* The census people were computer coded as sufficient information, but converted clerically to insufficient information for matching and follow-up.

The first type of census people who were not data-defined were not included in the E sample. Only data-defined people were included in the E sample. These data-defined people create person records in the census.

Census Data-Defined

The term "data-defined" was a term that has been used in the past at the Census Bureau to mean that a census person record has been created. The term "Total Persons" was the total number of people counted in the census at a census housing unit. The term "Selected Persons" referred to data-defined census people in a census household. The difference was people who were not data-defined. These people had no census person record. A whole person imputation procedure was employed to create characteristic data in the census for these people.

Two characteristics were required to be data-defined, where name counts as a characteristic. Name must have had at least three characters in the first and last name together. Other characteristics that could be used in the counting were relationship, sex, race, Hispanic origin, and either age or year of birth (see footnote 11). Census records were created on the HCUF for all data-defined people. Anyone who was not data-defined was a whole person imputation.

The count of census people who were whole person imputations was identified separately from the other census people with insufficient information for matching, because they were treated differently in the Dual System Estimator.
The number of whole person imputations was subtracted from the census count within post-strata. The E-sample people who were data-defined but with insufficient information for matching were included in the count of erroneous enumerations, and were, thus, excluded from the count of whole person imputations in the Dual System Estimator.

The mail return census forms were designed to collect characteristics for six people. However, space was provided for the names of the additional residents in households with seven to twelve people. The large household follow-up operation attempted to obtain characteristics for these people by telephone.

The exception was the enumerator questionnaire used in nonresponse follow-up. There was space for five people, but a continuation form was used to record data for persons six and above in large households.

There was some consideration given to using the names in the long form roster for persons seven through twelve to create person records and having them data-defined. However, it seemed preferable not to do this, and the A.C.E. did not attempt to create additional census data-defined people for these people with only names in large households. These people were whole person imputations. The number of whole person imputations used in the Dual System Estimator will correspond to the counts used in the census.

11 Person one did not automatically have a relationship of head of household like it did in 1990, and the telephone number in item 2, on the mail return questionnaire, did not count as a characteristic. The age and date of birth were examined together. If age was present, age/year of birth counted as a characteristic. If age was blank, but year of birth was present, then the age/year of birth counted as a characteristic. If age and year of birth were both blank, the age/date of birth did not count as a characteristic. The month and day of birth were used in Dress Rehearsal in the determination of counting the age/date of birth as a characteristic, but not in Census 2000.

Computer Coding of Insufficient Information for Matching and Follow-Up for the E Sample

The A.C.E. requires a minimum amount of information for matching and follow-up. The data-defined census people were reviewed to identify the ones with sufficient information for matching and follow-up for A.C.E. The minimum amount of data required for data-defined census people to have sufficient information for matching and follow-up was complete name and two characteristics. Complete name was defined as:
* First name (see footnote 12), middle initial, and last name
* First name and last name
* First initial, middle initial, and last name

The A.C.E. used the same criteria for classifying age as data-defined as the census, which is only age and year of birth were used to determine if age was present in counting characteristics to determine if the person had enough data to be data-defined in the census. In other words, when the age and year of birth were both blank, month and day of birth were not considered.

Clerical Coding of Insufficient Information for Matching and Follow-Up for the E Sample

There were cases where the name was not blank, but was too incomplete or unlikely to be real to permit matching and follow-up. Census names like "Mr. Doe," "Donald Duck," and "White Female" were coded insufficient information by the clerical matchers. The computer could not recognize names that were not real or were really incomplete names.

The retrieval system contained an image of the census questionnaire. The image of the census questionnaire was reviewed for census people coded as insufficient information for matching and follow-up to see if there was additional data that could be used to convert them to sufficient information for matching and follow-up. The data capture system may have had problems reading the handwritten entries, or there may be information outside the boxes on the census questionnaire. Names were obtained from the roster on the image of the questionnaire for the long forms. Children with first names and no last names were converted to sufficient information for matching and follow-up when the last name could be assumed from an adult with first and last name in the household. These updates to the names were captured into the matching software, which was programmed to decide if the person had sufficient information for matching and follow-up.

P-Sample Insufficient Information for Matching and Follow-Up

The P-sample people were reviewed by computer to identify people with insufficient information for matching and follow-up. The P-sample rules for sufficient information for matching and follow-up were the same as the E-sample rules, which was complete name and two characteristics. Cases identified by the computer as missing sufficient information were suppressed from viewing by the clerical matchers to prevent errors in matching people with insufficient information for matching. There were fewer than 4,000 P-sample people computer coded with insufficient information for matching and follow-up.

This computer review was established to avoid certain types of clerical errors in matching. For example, names like "DK" or "Don't Know" ("D" or "Don't" is the first name and "K" or "Know" is the last name), "RR" (refused for the first name and refused for the last name), or "M Smith," which could not be matched with certainty or, if treated as a nonmatch, followed up with a high rate of success. The census might have recorded a person with a complete name, which might be matched by a clerk. If matching were allowed, it would have been biased by what was enumerated in the census. A match would have resulted if the names were present at the address, and a nonmatch if the names were not in the census. Since names like "DK" could not be followed up, they would have been coded as insufficient information for matching and follow-up. Therefore, a match would have been assigned when the census obtained complete names, and unresolved when no match was found. The best way to avoid a bias was to suppress the P-sample cases computer coded as insufficient information for matching and treat them as unresolved.

The probability of a match was imputed for the P-sample people coded as insufficient information for matching and follow-up. They were treated in the same way as other P-sample people with unresolved match status. If the whole household had insufficient information for matching and follow-up, the people were removed and converted to noninterview status.
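The two computer rules described in this attachment, the census data-defined rule and the A.C.E. rule for sufficient information for matching and follow-up, can be sketched in Python as follows. This is an illustrative reading of the text rather than the Census Bureau's production logic. The class `PersonRecord` and all function names are hypothetical, missing fields are represented as `None` or empty strings by assumption, and the sketch assumes that the "two characteristics" in the sufficient-information rule are counted in addition to the complete name. The minimum of two characters each for a first name and a last name follows the report's footnote 12.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersonRecord:
    """Hypothetical person record; field names are illustrative only."""
    first: str = ""
    middle: str = ""
    last: str = ""
    relationship: Optional[str] = None
    sex: Optional[str] = None
    race: Optional[str] = None
    hispanic_origin: Optional[str] = None
    age: Optional[int] = None
    year_of_birth: Optional[int] = None


def count_characteristics(p: PersonRecord) -> int:
    """Count non-name characteristics; age and year of birth count once together."""
    count = sum(v is not None for v in (p.relationship, p.sex, p.race, p.hispanic_origin))
    if p.age is not None or p.year_of_birth is not None:
        count += 1  # month and day of birth are ignored, as in Census 2000
    return count


def is_data_defined(p: PersonRecord) -> bool:
    """Two characteristics required, where name counts as one if it has
    at least three characters in the first and last name together."""
    name_counts = len(p.first.strip()) + len(p.last.strip()) >= 3
    return count_characteristics(p) + int(name_counts) >= 2


def has_complete_name(p: PersonRecord) -> bool:
    """First and last name (two or more characters each), with or without a
    middle initial, or first initial plus middle initial plus last name."""
    first, middle, last = p.first.strip(), p.middle.strip(), p.last.strip()
    if len(last) < 2:
        return False
    if len(first) >= 2:
        return True
    return len(first) == 1 and len(middle) >= 1


def has_sufficient_information(p: PersonRecord) -> bool:
    """Complete name and two characteristics (assumed to exclude the name)."""
    return has_complete_name(p) and count_characteristics(p) >= 2


# "M Smith" with sex and age: data-defined, but the name is not complete.
m_smith = PersonRecord(first="M", last="Smith", sex="F", age=40)
print(is_data_defined(m_smith), has_sufficient_information(m_smith))  # True False
```

A computer rule of this kind cannot tell that "Donald Duck" is not a real name, which is why the text describes a separate clerical review for names that were complete in form but unlikely to be real.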
12 The minimum number of characters to be a name was two. Two characters were required in the first name and two characters in the last name.

Attachment 5. Final P Sample Person Match Codes

Matched

M = The P-sample and census people were matched. The P-sample person was a resident of the housing unit on Census Day.
MR = The follow-up interview determined that the matched person with unresolved residence status was a resident.
MU = The A.C.E. person was matched, but the follow-up interview obtained no useful information to resolve the residence status for the matched person who had a residence status of unresolved before follow-up. The P-sample person's residence status was unresolved.

Not Matched

NP = The P-sample person was not matched to a census person. There was no follow-up for the whole household nonmatches from person interviews with household members and the whole household nonmatches were not conflicting household nonmatches.
NC = The P-sample nonmatch was found on the census roster. This person in a partial nonmatch household was not matched to the census because only name was collected in the census for this person in a large household and the census person was not data-defined. No follow-up interview was necessary.
NR = The P-sample person was not matched and was identified as a resident in the block cluster on Census Day during the A.C.E. person follow-up interview.
NU = The P-sample person was not matched. Not enough information was collected during the A.C.E. person follow-up interview to identify the P-sample person as a resident or nonresident in the block cluster. The residence status for the P-sample person was unresolved. This code was also used when the P-sample person was followed up to collect geographic information and that information was not collected. The NU code was also used when the person did not live at the sample address on Census Day and the Census Day address was not complete enough to determine if the Census Day address was in the sample cluster.
Unresolved

P = There was not enough information collected during the follow-up interview to determine if the possible match was a match or not. The match status of the P-sample person was unresolved.
KI = Match not attempted for the P-sample person, because the person had insufficient information for matching and follow-up. The name was blank or incomplete or the name was complete, but the person had only one characteristic. This was a computer assigned code and these people were suppressed from view by the matchers.
KP = Match not attempted for the P-sample person, because (1) the name was incomplete, such as "Mr. Jones," or (2) the name was not a valid name, such as "White Female" or "Donald Duck." This was a clerically assigned code.

Removed from the P Sample

FP = The P-sample person was fictitious in this block cluster. The person was interviewed in error during the person interview. This person was not included in the final P sample.
NL = The P-sample person did not live at the sample address or in the block cluster on Census Day and was listed as a nonmover or outmover in error. This person was removed from the list of P-sample people, since he or she was collected during the person interview in error.
NN = The P-sample person was identified as a nonresident in the block cluster on Census Day during the A.C.E. person follow-up interview, because the person lived in group quarters on Census Day, or had another residence where the person should have been counted on Census Day according to census residence rules. This person was removed from the list of P-sample people, since he or she was collected during the person interview in error.
DP = The P-sample person was a duplicate of another P-sample person.
MN = The A.C.E. person follow-up interview determined that the matched person with unresolved residence status was not a resident in this housing unit or in this block cluster. The person was no longer in the list of P-sample people.
GP = The P-sample person was removed, because the person interview was conducted at a housing unit that exists outside the sample cluster. The person follow-up identified this housing unit as a P-sample geocoding error.

Attachment 6. E-Sample Person Enumeration Codes

Correctly Enumerated

M = The P-sample and E-sample people were matched. The E-sample person was correctly enumerated.
CE = The E-sample nonmatch was identified as correctly enumerated during the A.C.E. person follow-up interview.
MR = The A.C.E. person follow-up interview determined that the matched person with unresolved residence status was a resident.

Erroneously Enumerated (see footnote 13)

GE = The E-sample person was erroneously enumerated in this block cluster, because the census housing unit was a geocoding error (i.e., counted in the wrong block cluster). The E-sample person should have been enumerated elsewhere in the census.
EE = The E-sample nonmatch was identified during the person follow-up interview as erroneously enumerated.
FE = The E-sample nonmatch was determined to be fictitious in this block cluster during the follow-up interview. The person may have existed, but should not have been enumerated in the census within this block cluster. The E-sample person was erroneously enumerated in the census in this block cluster.
DE = The E-sample person was a duplicate of another E-sample person. The code was also used when the E-sample person was a duplicate of a census person in a surrounding block. The people in the E-sample housing unit were erroneously enumerated, because they were counted accurately in the surrounding block and duplicated in the sample cluster.
MN = The A.C.E. person follow-up interview determined that the matched person with unresolved residence status was not a resident in this housing unit or in this block cluster. The E-sample person was an erroneous enumeration.
KE = Match not attempted for the E-sample person. The name was blank or incomplete or the name was complete, but the person had only one characteristic. The name was incomplete or not a valid name, such as Child Jones, or Mickey Mouse.

Footnote 13: The E-sample people who were duplicated with non-E-sample people were not full erroneous enumerations. See the section on Duplicate Search Within Cluster in this chapter for a discussion of the probability of erroneous enumeration when there was duplication between a census person in the E sample and a non-E-sample person.

Unresolved

UE = Not enough information was collected during the A.C.E. person follow-up interview to identify the E-sample person as correctly or erroneously enumerated in the block cluster. The enumeration status for the E-sample person was unresolved. The UE code was also used when the person did not live at the sample address on Census Day and the Census Day address was not complete enough to determine if the Census Day address was in the sample cluster. This code was also used when the E-sample person was followed up to collect geographic information and that information was not collected.
MU = The E-sample person was matched, but the follow-up interview obtained no useful information to resolve the residence status for the matched person who had a residence status of unresolved before follow-up. The E-sample person's enumeration status was unresolved.
P = There was not enough information collected during the follow-up interview to determine if the possible match was a match or not. The match status of the P-sample person was unresolved.
GU = The geographic work for the targeted extended search was unresolved. The code had the same definition in both the before and after follow-up matching. The difference was in after follow-up, the code was only used in the list/enumerate clusters. The field work for the targeted extended search was not done or the block number on the form was not in the surrounding blocks, in the block cluster, or on the map. It was not clear where the housing unit was located.

Attachment 7. Final P-Sample Person Residence Status Codes

Resident

M = The P-sample and census people were matched.
MR = The follow-up interview determined that the matched person with unresolved residence status was a resident.
NR = The P-sample person was not matched and was identified as a resident in the block cluster on Census Day during the A.C.E. person follow-up interview. The P-sample person was missed in the census.
NP = The P-sample person was not matched to a census person. There was no follow-up for the whole household nonmatches from person interviews with household members and the whole household nonmatches were not conflicting household nonmatches. These people were considered residents of the housing unit on Census Day.
NC = The P-sample nonmatch was found on the census roster. This person in a partial nonmatch household was not matched to the census because only name was collected in the census for this person in a large household and the census person was not data-defined. No follow-up interview was necessary.

Nonresident

FP = The P-sample person was fictitious in this block cluster. The person was interviewed in error during the person interview. This person was not included in the final P sample.
NL = The P-sample person did not live at the sample address or in the block cluster on Census Day and was listed as a nonmover or outmover in error. This person was removed from the list of P-sample people, since he or she was collected during the person interview in error.
NN = The P-sample person was identified as a nonresident in the block cluster on Census Day during the A.C.E. person follow-up interview, because the person lived in group quarters on Census Day or had another residence where the person should have been counted on Census Day according to census residence rules. This person was removed from the list of P-sample people, since he or she was collected during the person interview in error.
DP = The P-sample person was a duplicate of another P-sample person.
MN = The A.C.E. person follow-up interview determined that the matched person with unresolved residence status was not a resident in this housing unit or in this block cluster. The person was no longer in the list of P-sample people.
GP = The P-sample person was removed because the person interview was conducted at a housing unit that exists outside the sample cluster. The person follow-up identified this housing unit as a P-sample geocoding error.

Unresolved

MU = The A.C.E. person was matched, but the follow-up interview obtained no useful information to resolve the residence status for the matched person who had a residence status of unresolved before follow-up. The P-sample person's residence status was unresolved.
NU = The P-sample person was not matched. Not enough information was collected during the A.C.E. person follow-up interview to identify the P-sample person as a resident or nonresident in the block cluster. The residence status for the P-sample person was unresolved. This code was also used when the P-sample person was followed up to collect geographic information and that information was not collected. The NU code was also used when the person did not live at the sample address on Census Day and the Census Day address was not complete enough to determine if the Census Day address was in the sample cluster.
P = There was not enough information collected during the follow-up interview to determine if the possible match was a match or not. The residence status of the P-sample person was unresolved.
KI = Match not attempted for the P-sample person, because the person had insufficient information for matching and follow-up. The name was blank or incomplete or the name was complete, but the person had only one characteristic. This was a computer assigned code and these people were suppressed from view by the matchers.
KP = Match not attempted for the P-sample person, because (1) the name was incomplete, such as Mr. Jones, or (2) the name was not a valid name, such as White Female or Donald Duck. This was a clerically assigned code.

FORM D-1302 (6-23-99) Section 4 - LISTING PAGE

Hello, I'm (Your name) from the U.S. Bureau of the Census. Here's my identification. We are listing addresses as part of the Census 2000, and I have a few questions to ask you.
[Figure 4-1. Address Listing Book Page for Single and Multiunit Structures. The figure reproduces the listing page of Form D-1302; the form layout is not recoverable from the extracted text. Legible item labels: (1) Line No.; (2) Block No.; (3) House No.; (4a) Road/street name; (5a) Rural Rte. No.; (5b) Box No.; (6) Map Spot No. (do not fill for Mobile Home Park); (7) PO Box No.; (8) ZIP Code; (9) Householder name; (10) Physical location description or E-911 address (maximum 50 characters); (11a) How would you describe this type of address?; (11b) Unit status; (12) Remarks (do not use this space for location description); (13) How many (housing units/living quarters/apartments), occupied or vacant, are there at (basic address)?; (14) Besides the unit(s) you have just mentioned, has this address been converted into apartments where other people might live?; (15a) Canvass the multi-unit basic address and enter the number of units on each floor; (15b) Total units from your canvass; (15c) How many apartments, occupied or vacant, are there at (basic address)?; (15d) If there is a difference between your canvass and respondent total, resolve the difference; (16) Total number of housing units, occupied or vacant, at this basic address; (17) Information obtained from (HH member, Proxy, Manager, Observation). Items 9 and 10 are filled in areas without city style addresses (see cover, Section 1, item 5).]
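
The groupings in Attachments 6 and 7 can be held in a simple lookup structure. The Python sketch below is illustrative only: the dictionary and function names are hypothetical and do not come from any Census Bureau system; only the assignment of codes to groups is taken from the attachments. Separate tables are kept because some codes (for example MN, MU, and P) appear in both attachments.

```python
# Hypothetical lookup tables; group membership follows Attachments 6 and 7.
E_SAMPLE_ENUMERATION_GROUPS = {
    "correctly enumerated": {"M", "CE", "MR"},
    "erroneously enumerated": {"GE", "EE", "FE", "DE", "MN", "KE"},
    "unresolved": {"UE", "MU", "P", "GU"},
}

P_SAMPLE_RESIDENCE_GROUPS = {
    "resident": {"M", "MR", "NR", "NP", "NC"},
    "nonresident": {"FP", "NL", "NN", "DP", "MN", "GP"},
    "unresolved": {"MU", "NU", "P", "KI", "KP"},
}


def classify(code, groups):
    """Return the group name that contains `code` in the given table."""
    for group_name, codes in groups.items():
        if code in codes:
            return group_name
    raise KeyError(f"unknown code: {code!r}")
```

For example, `classify("GE", E_SAMPLE_ENUMERATION_GROUPS)` yields the erroneous enumeration group, while `classify("NR", P_SAMPLE_RESIDENCE_GROUPS)` yields the resident group.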
Chapter 5. Targeted Extended Search

INTRODUCTION

The concept behind the dual system estimate is to estimate the census omission rate using the P sample and the erroneous enumeration rate using the E sample. The complete definition of being omitted from or erroneously enumerated in the census includes the concept of location, that is, a successful enumeration must have located the person in the right place. Right location in the census means anywhere in the block where the reported housing unit address was located, or in the search area, defined as one ring of adjacent blocks. The operation concerned with locating and matching the persons in the surrounding areas is Targeted Extended Search, or TES. The name was chosen because, unlike the similar procedure in the 1990 Post-Enumeration Survey (PES) where the surrounding area of every cluster was searched, the A.C.E. search was targeted in two ways:

1. Results from the initial housing unit matching operation were used to select the housing units that are candidates for TES.
2. In most cases, only clusters that include TES-eligible housing units were included in TES.

This chapter focuses on the statistical methods used in TES. A.C.E. field and processing activities, including TES, are described in Chapter 4.

Overview

The 1990 PES included a search in all blocks surrounding each sample cluster. Every person in every house in every block adjoining every sample block cluster was included in the search. This was determined to be burdensome in terms of time, cost, and perhaps mental fatigue on the part of matchers performing low-payoff searches (Hogan, 1993). To improve efficiency, the Census 2000 A.C.E. took a more focused (i.e., targeted) approach in selecting clusters, defining search areas, and determining which housing units and residents would be part of surrounding block operations. The Census 2000 A.C.E. search operation differed from the 1990 PES in four primary ways:

1. Search area definition.
2. Amount of searching.
3. Persons eligible for search.
4. TES weighting.

Search Area Definition

The search area for the 2000 A.C.E. was limited to either just the sample block cluster or one ring of adjacent blocks. An adjacent block is one that touches the cluster of sample blocks at one or more points. This definition includes the blocks that touch the corner of the block cluster. Results from empirical research, using Census 1998 Dress Rehearsal data, show that the additional benefits of using two rings of surrounding blocks are negligible (Wolfgang, 1999).

Amount of Searching

There were two important differences between the extent of searching in the 1990 PES and the 2000 A.C.E.:

1. Only about 20 percent of A.C.E. block clusters had their surrounding areas searched, whereas in 1990 the surrounding area of every block cluster was searched.
2. The search was targeted (in most cases) to only housing units identified as being likely to exhibit geocoding error; in 1990, all persons in surrounding areas were eligible for search.

The clusters with a high number of potential geocoding errors were identified from the results of the initial housing unit matching operation and subsequent field follow-up (see Chapter 4). These were A.C.E. block clusters with a large number of Independent Listing housing units not found in the January 2000 Census Address List. These types of nonmatches are possibly census geocoding errors of exclusion (i.e., not included in the census within the sample area although they should have been). On the census side, A.C.E. block clusters with a large number of census geocoding errors are likely to be errors of inclusion (i.e., reported by the census in the block cluster, although the unit is physically outside the A.C.E. block cluster). These two types of housing units were eligible to be in the extended search as part of TES operations, and are thus TES-eligible housing units.

Any cluster that included at least one potential census geocoding error, of either inclusion or exclusion, was eligible to have TES operations performed in it and is termed a TES-eligible cluster. Clusters with no such potential geocoding errors became non-TES-eligible. The clusters in which TES was actually done are TES clusters, and were selected from among the TES-eligible clusters either with certainty or by probability sampling.

Results from the 1990 PES show that geocoding errors are highly clustered. Slightly over 77 percent of the whole household nonmatches were concentrated in less than one-fourth of the PES sample block clusters. On the other hand, about 72 percent of the census geocoding errors were found in less than 3 percent of the PES sample block clusters. TES is a good example of Deming's 80-20 guideline: 80 percent of the benefits are realized by solving 20 percent of the problems.

Persons Eligible for Search

In order to be included in TES operations, a person must live in:
* a TES cluster; and
* a TES-eligible housing unit

A person in a housing unit that was not a TES-eligible housing unit was a non-TES person, and thus was not directly affected by TES operations. Any person in a TES-eligible housing unit was a TES person, unless someone in the housing unit matched (i.e., someone is confirmed to be not a TES person). TES persons in clusters that were not selected for TES operations were identified, but did not have TES operations applied. Instead, these cases were effectively removed from the sample by having an assigned weight of zero. They were represented by persons in other TES clusters selected by sampling.

TES Weighting

Every selected TES cluster was assigned a sampling weight equal to the reciprocal of its selection probability. This TES cluster weight was assigned to all TES persons in that cluster and was multiplied by their A.C.E. sampling weights to produce their TES-adjusted weights. The TES-adjusted weight for TES persons in clusters not selected for TES is zero. In this way, the TES persons in the TES clusters represent the TES persons in non-TES clusters. All elements of the dual system estimate (DSE) calculation, except those involving inmovers, can be affected by the TES weighting because TES persons can be nonmovers or outmovers, matches or nonmatches, and correct or erroneous enumerations.

CLUSTER SAMPLING

The decision to select 20 percent of the A.C.E. block clusters for TES was based on the assumption that most of the TES-eligible housing units and persons would be concentrated in a small fraction of the block clusters. Hence, most of the benefits of a complete surrounding area search could be realized at a substantial reduction in cost, if a disproportionate share of the effort was concentrated in the clusters with the greatest likely payoff: the ones with the most TES-eligible housing units and persons.

Targeting these clusters would achieve one of the principal goals of the surrounding areas search: variance reduction. However, it is of at least equal importance that the surrounding area search be balanced. There are two ways TES could have been out of balance: 1) the geographical area included in the search could have differed between the P and E samples; 2) the TES block cluster sampling could have selected clusters containing errors of inclusion with greater or less likelihood than clusters with errors of exclusion. To achieve the balancing in sample selection, it was necessary for each cluster with TES-eligible housing units and persons to have some probability of being selected for TES and be weighted by the inverse of the selection probability.

The information available for TES selection included the results of the initial housing unit matching, which included the results from the housing unit follow-up.
Housing unit follow-up indicated, among other things, the count of potential geocoding errors of inclusion and exclusion. The geocoding errors of inclusion were census units found outside the cluster. Potential geocoding errors of exclusion were coded as address nonmatches in the independent listing. The combined number of census geocoding errors and independent listing address nonmatches were considered to be the number of potential geocoding errors in each cluster. The probability that any cluster would be selected for TES depended on the count of its potential geocoding errors for most clusters. Exceptions are relisted clusters and clusters that were enumerated in the census using the List/Enumerate methodology. Those clusters did not go through housing unit matching and follow-up.

A housing unit that represented a potential geocoding error could have been discovered by TES operations to be a geocoding error or an actual coverage error. Putting a particular housing unit in the category of "potential" made it, and the persons living in it, eligible for TES. This search was intended to determine whether the housing unit and persons were geocoded incorrectly into a neighboring block, in which case they would be counted as correctly enumerated, or were truly enumeration errors.

Hence, the following TES selection strategy was implemented:

* Clusters that did not have counts of potential geocoding errors available at the time of the TES sampling operation were assigned to a separate TES procedure. Clusters that were relisted (which were later included in TES with certainty) or enumerated using the List/Enumerate methodology (which were ultimately excluded from TES) fall into this group.
* The 5 percent of clusters that included the largest number of housing units that were potential geocoding errors were included in TES with certainty.
* The 5 percent of clusters that had the most housing units that were potential geocoding errors, when weighted by their A.C.E. cluster weight, were also included in TES with certainty. The 5 percent of clusters included in the above bullet for having the most unweighted cases were excluded before this step was performed, so that a total of 10 percent of the A.C.E. clusters were selected based on the two certainty criteria.
* All clusters with at least one potential geocoding error housing unit were assigned to a noncertainty stratum to be sampled at a uniform national rate to be included in TES. The sampling rate was set so that the overall size of the TES sample, including those selected by certainty and by sampling, totaled 20 percent of A.C.E. clusters (excluding the first group).

Clusters with no potential geocoding errors were excluded from TES selection since there were no housing units or persons that were candidates for TES operations. This creates a potential for a small bias in TES, because housing units added to or deleted from the address lists after the selection of TES clusters were not eligible for TES operations.

Sampling Methodology

For the United States as a whole, there were 11,303 A.C.E. clusters. Of these, 420 were excluded from TES selection because they used the List/Enumerate census method. Of the remaining 10,883 clusters, 20 percent, or 2,177, were selected for TES. Of the eligible clusters, 62 were relist clusters and were not part of the normal TES selection.
(These clusters did not count as part of the 2,177 TES target sample size.) Five percent of the sampling universe, or 544 clusters, with the most potential geocoding errors were selected for TES with certainty and assigned a TES weight of 1. Of the remaining clusters, an additional 544 with the most potential geocoding errors, when weighted by the A.C.E. cluster weight, were also selected with certainty and assigned a TES weight of 1.

Of the remaining clusters that included at least one potential geocoding error, 1,089 were selected using systematic random sampling with equal probability. There were 5,326 clusters in the noncertainty stratum (i.e., all those that were not already selected by one of the other means and that contained at least one potential geocoding error), so the selected clusters were assigned a TES weight of 5,326 divided by 1,089 or 4.8907. The remaining 4,407 clusters were out of scope for TES because they had no identified potential geocoding errors.

For purposes of drawing the systematic sample, clusters were sorted in the order:

* State
* First-phase Sampling Stratum
* Second-phase Sampling Stratum
* Small Block Cluster Sampling Stratum
* Cluster Number

The first four characteristics are the same ones used to select the A.C.E. sample. Sorting clusters in this order for TES improved the representativeness of TES with respect to the national A.C.E. sample. After sorting in this order, the clusters were systematically sampled with equal probability using a take-every of 4.8907 and were assigned a TES weight equal to that figure.

Results of Cluster Sampling

The TES sample included 2,239 block clusters out of 11,303, or 19.8 percent. (Originally it had been intended to include a small number of List/Enumerate clusters in TES, and some sample was set aside for them but never used.) The clusters included 45,000 E-sample and 77,000 P-sample housing units, representing 80 and 73 percent of TES-eligible units in their respective samples before subsampling within large block clusters was performed.
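
The sample-size arithmetic and the take-every described under Sampling Methodology can be checked with a short Python sketch. This is a minimal illustration, not the production procedure: the function names and the cluster-selection labels are hypothetical, and the handling of the fractional take-every (a random start followed by fixed steps through the sorted list) is an assumption, since the text does not say how the fractional interval was applied.

```python
import random

# Figures quoted in the Sampling Methodology text.
TOTAL_ACE_CLUSTERS = 11_303
LIST_ENUMERATE_CLUSTERS = 420
NONCERTAINTY_STRATUM_SIZE = 5_326

sampling_universe = TOTAL_ACE_CLUSTERS - LIST_ENUMERATE_CLUSTERS    # 10,883
target_sample = round(0.20 * sampling_universe)                     # 2,177
certainty_per_criterion = round(0.05 * sampling_universe)           # 544
noncertainty_sample = target_sample - 2 * certainty_per_criterion   # 1,089
take_every = NONCERTAINTY_STRATUM_SIZE / noncertainty_sample        # about 4.8907


def systematic_sample(sorted_units, n_select, rng):
    """Equal-probability systematic sample from an already sorted list.

    Assumption: random start in [0, interval), then fixed fractional steps.
    """
    interval = len(sorted_units) / n_select
    start = rng.random() * interval
    last = len(sorted_units) - 1
    return [sorted_units[min(int(start + i * interval), last)]
            for i in range(n_select)]


def tes_adjusted_weight(ace_weight, is_tes_person, cluster_selection):
    """A.C.E. weight times the cluster's TES weight; non-TES persons are unchanged.

    cluster_selection uses hypothetical labels:
    'certainty', 'sampled', or 'not_selected'.
    """
    if not is_tes_person:
        return ace_weight
    tes_weight = {
        "certainty": 1.0,
        "sampled": take_every,
        "not_selected": 0.0,
    }[cluster_selection]
    return ace_weight * tes_weight
```

Under these assumptions, a TES person with an A.C.E. weight of 100 in a sampled cluster would carry a TES-adjusted weight of roughly 489, while the same person in a TES-eligible cluster that was not selected would carry a weight of zero.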

Because of differences in procedures, more E-sample units got into TES by certainty (76 percent versus 66 percent), while more P-sample units were selected by sampling, 7 percent to 5 percent. TES units represent about 7 percent of the housing units resulting from initial housing unit matching. (See Table 5-1.) This was not the final number of housing units included in TES field operations because:

* Subsampling within large block clusters reduced the number of A.C.E. housing units in clusters with 80 or more housing units; and
* Housing unit counts for Relist clusters were not available at the time the sample was selected

Subsampling within large block clusters reduced the final TES workload to 12,000 E-sample and 18,000 P-sample housing units.

Table 5-1. TES Sampling Frame and Selection Results

                                      Potential geocoding errors
                                      Total potential errors   Errors of inclusion   Errors of exclusion
                           Clusters   Number     Percent       Number    Percent     Number    Percent
 Total                       11,303   122,440    100           45,053    100         77,387    100
 Out-of-scope                 4,827         0    ...                0    ...              0    ...
   List/Enumerate               420         0    ...                0    ...              0    ...
   No TES HUs                 4,407         0    ...                0    ...              0    ...
 Eligible for TES             6,476   122,440    100           45,053    100         77,387    100
 Certainty                    1,150    85,309     70           34,089     76         51,220     66
   Top weighted                 544    11,858     10            4,037      9          7,821     10
   Top unweighted               544    73,451     60           30,052     67         43,399     56
   Relist                        62        0*    ...               0*    ...             0*    ...
 Noncertainty                 5,326    37,131     30           10,964     24         26,167     34
   Selected into sample       1,089     7,642      6            2,106      5          5,536      7
   Not selected               4,237    29,489     24            8,858     20         20,631     27
 TES clusters                 2,239    92,951     76           36,195     80         56,756     73

 *TES units in Relist clusters had not been determined at the time the sample was selected.
 Note: Percentages in table may not add to total due to rounding.

TES FIELD AND PROCESSING ACTIVITIES

Details on the operations involved in TES are described in Chapter 4. In summary, the main activities are:
* Cluster selection (Spring 2000). This operation selects the clusters for TES. Because of the need to select the cluster sample at a particular time, the final E and P samples had not been selected at the time of this operation.
* Search for census units in surrounding blocks (Summer 2000). Determines if census units erroneously included in the sample block cluster are located within the surrounding ring of blocks. This field operation is described more fully in Chapter 4.
* Identify TES Persons (Fall 2000). An automated activity performed at the National Processing Center in Jeffersonville, Indiana. See Chapter 4 for more information.
* Extend the search area to surrounding blocks for TES persons (Fall 2000). The P-sample TES persons were allowed to match to census records in the surrounding block. The E-sample TES persons were treated as correct enumerations if the census unit was located in a surrounding block. This was a clerical operation.
* Assign TES weights (Winter 2000/2001). TES persons identified in TES-eligible clusters were assigned the TES weight associated with that cluster, either 1.0 for a cluster selected with certainty or 4.8907 for a cluster selected by sampling. TES persons in TES-eligible clusters not selected into the sample were assigned a zero weight.

Adds and Deletes

The preliminary census address list of housing units as of January 2000 was the source for the initial housing unit matching on which TES is based. Since some housing units on the January 2000 list were later deleted and others added, the final list of census housing units did not exactly match the initial housing unit matching counts of potential geocode errors. Therefore, procedures were necessary to update the TES identifications for adds and deletes.

In the vast majority of cases, where adds and deletes were not involved, P-sample housing units are TES-eligible if they did not match to a census address. However, if a P-sample unit was matched to an address during initial housing unit matching, but that address was deleted, then the unit was considered nonmatched. To adjust for deletions, P-sample persons in housing units that were matched to deleted census housing units were flagged as TES persons, as long as the unit did not contain any persons matched within the sample block (i.e., non-TES persons). This adjustment was performed only on persons in TES clusters.

E-sample housing units that were added to the final census list after January 2000 could represent geocoding errors, but they were not part of TES field operations.
Without field operations, persons in such units would never be identified as surrounding block correct enumerations. Therefore, a correct enumeration probability was imputed for such persons in TES clusters. The imputed probability is the overall correct enumeration probability of all resolved persons in geocoding error housing units in the TES sample. See Chapter 6 for a description of the procedure.

Table 5-2. Effect of Census Address List Changes after January 2000

                                                          Count    Weighted     Matches/correct enumerations
 P sample - persons in housing units matched to deletes   2,319    2,036,564    675,892
 E sample - geocode error adds                               53       15,307     14,915

TES IN DUAL SYSTEM ESTIMATION

Accounting for TES in the DSE calculation is primarily a matter of applying weights properly. Every person in the A.C.E. is either a TES person or a non-TES person, and every A.C.E. cluster is either a TES cluster or a non-TES cluster. Every TES person is assigned the TES weight of his or her A.C.E. cluster. The calculation of the DSE requires the use of seven distinct components, all but one of which represent the sum of the A.C.E. weights for some group of persons in the A.C.E., including both TES and non-TES persons. Hence, six of the seven components represent a weighted sum of TES and non-TES persons, the former with their TES cluster weights applied.

Applying TES Weights

Every A.C.E. cluster including TES persons has a TES weight, although that weight is zero if the cluster is not selected for TES. A TES person must be weighted by the associated TES weight. The A.C.E. weight is multiplied by the TES weight to produce a person weight. TES weighting does not affect the weight of non-TES persons. Their individual weights are the same as the A.C.E. weights.

Table 5-3. TES Weights by TES Status of the Person and Cluster

                     TES cluster                                       Non-TES cluster
 TES persons         1, if cluster in TES with certainty               0
                     4.8907, if cluster selected for TES by sampling   0
 Non-TES persons     1                                                 1

The issues related to inmovers, outmovers and noninterviews are the same for TES persons as for all other persons. From a calculation standpoint, the only effect that TES status has on the dual system estimates is in applying the cluster's TES weight.

DSE Calculation

The DSE for Census 2000 is:
DSE(DD)(CE N e)(N nN i M n(M o N o)N i)where DDcensusdata-definedpersons CEestimatednumberofA.C.E.E-Samplecorrectenumerations N enumberofA.C.E.E-Samplepersons N nestimatednumberofA.C.E.P-Sample nonmovers N iestimatednumberofA.C.E.P-Sampleinmovers N oestimatednumberofA.C.E.P-Sample outmovers M nestimatednumberofA.C.E.P-Samplenonmovermatches M oestimatednumberofA.C.E.P-SampleoutmovermatchesTheestimatorhassevenA.C.E.distinctcomponents(plusDDfromthecensusenumeration).Sixofthesevencom-ponentsrepresentaweightedsumofpersons,including bothTES-andnon-TESpersons.Otherthaninmovers,whocannotbeTESpersons,eachoftheDSEcomponentsisexpressedas: | |||
\[ \sum_{i=1}^{n} \sum_{j=1}^{n_p} w^{*}_{ij} \, m_{ij} \, x_{ij} \; + \; \sum_{i=1}^{n} \sum_{j=1}^{n_p} w^{*}_{ij} \, m_{ij} \, y_{ij} \; + \; \sum_{i=1}^{n} \sum_{j=1}^{n_p} w^{*}_{ij} \, t_{ij} \, m_{ij} \, z_{ij} \]

where
i = cluster index
j = person index
n = number of block clusters in the A.C.E. sample
n_p = number of persons in block cluster i
x_ij = 1 if the person is not a TES person, 0 otherwise
y_ij = 1 if the person is a TES person and is in the TES sample with certainty, 0 otherwise
z_ij = 1 if the person is a TES person and is in the TES systematic sample, 0 otherwise
m_ij = characteristic of interest: match, correct enumeration, E-sample person, or P-sample person
w*_ij = weight used for estimation (includes inverse of the probability of selection for A.C.E., adjustment for household noninterview, and weight trimming)
t_ij = TES sampling weight, the TES systematic sample take-every

EFFECTS OF TES ON DUAL SYSTEM ESTIMATION

The principal effect of TES in Census 2000 is approximately what was expected: the overall correct enumeration rate was 2.9 percentage points higher with TES than it would have been without, and the overall match rate was 3.8 percentage points higher (see Table 5-4). The larger increase in the match rate, as compared to the correct enumeration rate, occurred because there were more identified potential geocoding errors in the P sample than in the E sample.

Table 5-4. Effect of TES at the National Level
                                      With TES       Without TES    Difference*     Effect of TES
                                      (1)            (2)            (1)-(2)         (1)/(2)
E sample
  Persons (N_e)                       264,578,862    264,634,794    (55,932)**      1.000
  Correct Enumerations (CE)           252,096,238    244,387,951    7,708,288       1.032
  CE Rate (%)                         95.3           92.3           2.9             1.032
P sample
  Persons (N_p)                       263,037,259    262,906,916    130,343**       1.000
  Matches                             240,878,622    230,681,205    10,197,418      1.044
  Match Rate (%)                      91.6           87.7           3.8             1.044
Ratio of CE to Match Rate             1.040          1.053          (0.012)         0.989
Coefficient of Variation for Ratio    0.129          0.314          (0.197)         0.405
* Percentages were calculated on unrounded values.
** The weighted E- and P-sample sizes differed slightly because of variance in TES sampling.
Note: Table above reflects national totals without regard to post-stratification and differs from other totals in which post-stratum totals were aggregated.

The difference in the number of matches versus correct enumerations (10.2 million and 7.7 million, respectively) from TES had been a source of concern, since it suggested the possibility of balancing error. Balancing error would have occurred if the geographic boundaries included in the P and E samples had not been consistent. For instance, suppose the P sample was allowed to match to census persons in housing units beyond the first ring, while a census unit could only be classified correct if it was within the first ring. Adams and Liu (2001) performed an evaluation study of the P-sample housing units in A.C.E. and concluded that the main source of the measured imbalance was geocoding error in the P sample.

Table 5-4 shows that TES increased the number of correct enumerations from 244.4 million to 252.1 million and matches from 230.7 million to 240.9 million. Before TES, there had been 20.2 million erroneous enumerations, of which 7.7 million were geocoding errors that were classified as correct enumerations by TES. TES also allowed 10.2 million additional P-sample matches to occur out of 32.2 million original nonmatches. Improving both the match rate and the correct enumeration rate this much significantly reduces the variance of the DSE, since over 90 percent of people match or are correctly enumerated.

Table 5-5 shows the significant contribution that TES makes to variance reduction. For the A.C.E. considered as a whole (i.e., a direct DSE of the entire population without post-stratification), the coefficient of variation is 0.129 percent with TES and 0.314 percent without TES.

Table 5-5. Effect of TES on Coefficient of Variation (CV)
                Standard error    CV (percent)
With TES        355,451           0.129
Without TES     877,664           0.314
Note: Table above reflects national totals without regard to post-stratification and differs from other totals in which post-stratum totals were aggregated.

At the post-stratum level, the average improvement in the DSE standard error is about 33 percent. The gains in precision as measured by variance show that TES makes dual system estimates more precise, and that TES improves the quality of the A.C.E., so long as it does not make the DSEs less accurate by introducing bias. The coefficient of variation was reduced for a majority of the collapsed post-strata (448 original post-strata were collapsed into 416 post-strata for DSE calculation purposes).

Table 5-6. Effect of TES on Post-Stratum CVs [Percent]
                                        With TES    Without TES
Average CV                              2.07        2.66
Median CV                               1.81        2.32
Average CV weighted by census count     1.30        1.93

Chapter 6. Missing Data Procedures

INTRODUCTION

This chapter gives an overview of missing data procedures for the Census 2000 Accuracy and Coverage Evaluation (A.C.E.). General background information is presented first, while the following sections describe three types of procedures used to account for data missing in the A.C.E. The noninterview adjustment accounts for whole-household nonresponse. The next section describes the characteristic imputation used to assign values for specific missing demographic variables. Finally, for persons with unresolved match, residence, or enumeration status, a probability of matching, residence, or correct enumeration was assigned according to procedures described in this chapter.

As missing data in the A.C.E. were addressed after the completion of the field operations that produced the A.C.E. data files, a knowledge of the field activities and the circumstances that led to specific outcomes is necessary to understand the motivation for these procedures. For this information, the reader is referred to Chapter 4 for details on the field operations.

The missing data procedures used in the Census 2000 A.C.E. were similar to those used on the Integrated Coverage Measurement (ICM) sample in the Census 2000 Dress Rehearsal. An outline of the ICM procedures and a summary of related research are given in Ikeda, Kearney, and Petroni (1998). Kearney and Ikeda (1999) provide an overview of the results from the Dress Rehearsal. For detailed missing data procedures for the 2000 A.C.E., see Cantwell (2000) and Ikeda and McGrath (2001). A few basic results on missing data from 2000 are found in this chapter; many more results can be found in Cantwell et al. (2001).
BACKGROUND

Before dual system estimates were calculated, it was necessary to account for missing information from the interviews of P-sample people and from the matching operations. It should be noted that the term "missing data" applies after all follow-up attempts have been made. Chapter 4 describes some of the extensive field procedures conducted to minimize the resulting level of such missing data. These activities, all specified in advance, included multiple attempts at interviews, the use of highly trained clerks and technicians to resolve cases, and the follow-up of cases where a second interview could provide additional required information.

There were two main types of missing data in the A.C.E. and three processes used to correct for them. The first type was unit missing data. These were households that were not interviewed in the A.C.E. either because they could not be contacted or because the interview was refused. The noninterview adjustment process spread the weights of these households among households that were interviewed in the same noninterview cell.

The other type of missing data was item missing data. This situation occurred when some information for a household or person was available but portions of the data were missing. Two groups of missing data items had to be addressed: demographic items and items relating to a specific operational status. Missing age, sex, tenure, race, and Hispanic origin were imputed to allow the production of estimates of the census undercount by these characteristics, and because they were necessary to assign people to post-strata.

For a small number of people in the P sample, there was not enough information available to determine the match status (whether or not the person matched to someone in the census in the appropriate search area) or the residence status (whether or not the person was living in the block cluster on Census Day). Determining residence status was important for the P sample because Census Day residents of the block clusters in the sample were used to estimate the proportion of the population who were not counted in the census. Similarly, some people in the E sample lacked information to determine whether the person was correctly enumerated. Such cases where status could not be determined were said to be unresolved. Generally, for cases with missing status a probability of residence, match, or correct enumeration was assigned based on information available about the specific case and about cases with similar characteristics.

In the 1990 Post-Enumeration Survey, a hierarchical logistic regression program was used to calculate probabilities of match and correct enumeration for cases with missing information. (Due to the procedure used to treat movers in 1990, residence status played a different role then.) The model and some results are discussed in Belin et al. (1993). During census tests in 1995 and 1996, certain components of missing data were addressed using logistic regression, while for other components a simpler procedure called imputation cell estimation was used. The latter procedure was used exclusively in the Census 2000 Dress Rehearsal in 1998. Data from these tests indicate that the exact method of calculating probabilities for unresolved status (match, residence, or correct enumeration) has a minor effect on the dual system estimates. More details of this research can be found in Ikeda (1997, 1998, 1998b, and 1998c) and Cantwell (1999). Based on these findings and concerns about implementing logistic regression in a production environment, the simpler procedure (that is, imputation cell estimation) was used to estimate missing data items in the A.C.E.

Noninterview Adjustment (Household Level)

At the time of the Computer Assisted Personal Interview (CAPI), questions were asked to determine who lived in the household on Interview Day and who lived there on Census Day, and a mover status was assigned based on the replies. Thus two rosters were created for each household: the Census Day roster and the Interview Day roster. The A.C.E. used inmovers to estimate the number of P-sample movers in the post-stratum, while using outmovers to estimate the match rate of the movers. This method is referred to as Mover Procedure C or PES-C in the research studies. See Chapter 4 for descriptions of the terms nonmover, inmover, and outmover.

All inmovers and all nonmovers were generally assumed to be A.C.E. Interview Day residents, with the exception of infants born after Census Day. People living in group quarters, such as college students in dormitories, were not eligible for the P sample. Therefore, for the purpose of estimating the number of inmovers, person inmovers aged 18 to 22 who were living in group quarters on Census Day were not considered to be Interview Day residents.

Noninterview adjustment was performed only on the P sample. The procedure was similar to that used in the Census Dress Rehearsal. Due to the mover procedure described above, there were two noninterview adjustments: one based on housing unit status as of Census Day (i.e., the Census Day roster), and the other based on
housing unit status as of the day of the A.C.E. interview (i.e., the Interview Day roster).

An occupied housing unit was defined as an interview (for the given reference day, Census Day or Interview Day) if there was at least one person (with a name and at least two demographic characteristics) who possibly or definitely was a resident of the housing unit on the given reference day. An occupied housing unit (as of the given reference day) that was not an interview was a noninterview. Thus a unit that was vacant, removed from the list of eligible housing units (because, for example, it was demolished or used only as a business), or in certain special places was not considered an interview or a noninterview. In the latter two situations, the unit was deleted from the list of A.C.E. sample housing units.

If a housing unit was found to be vacant on Census Day or deleted from the sample, then that household did not factor into the Census Day noninterview adjustment. The same concept applies to Interview Day. Thus, vacant and deleted units did not contribute toward dual system estimation. An example of an illustrative block cluster, provided in Figure 6-1, shows how the status of a housing unit on Census Day and Interview Day would be determined. Results of the A.C.E. interviewing operation are shown in Table 6-1.

Table 6-1. Status of Household Interviews in the A.C.E. [Unweighted]
                         Census Day              A.C.E. Interview Day
                         Number     Percent      Number     Percent
Total housing units      300,913    100.0        300,913    100.0
  Interviews             254,175    84.5         264,103    87.8
  Noninterviews          7,794      2.6          3,052      1.0
  Vacant units           28,472     9.5          29,662     9.9
  Deleted units          10,472     3.5          4,096      1.4
Noninterview rate                   3.0%                    1.1%
Note: Percentages in table may not add to total due to rounding.

Of the 261,969 housing units occupied on Census Day, 7,794 (3.0 percent) were noninterviews. The corresponding numbers for Interview Day were 267,155 and 3,052 (1.1 percent). The noninterview rate was higher for Census Day than Interview Day, because interview status was determined by results obtained on Interview Day. On that date, information was sought for both Census Day and Interview Day. Anytime a household member or knowledgeable proxy could be reached, an interview for Interview Day was generally obtained. Census Day data was not always obtainable from the same respondent, usually in cases when the housing unit's occupants had moved in after Census Day.

Each of the two noninterview adjustments generally spread the weights of noninterviewed units over interviewed units in the same noninterview cell, defined as the sample block cluster crossed with the type of basic address. For purposes of this adjustment, the types of basic address were single-family, multiunit (such as apartments), and all others. The Census Day noninterview adjustment, determined according to the status of housing units as of Census Day, was used to adjust the person weights of nonmovers and outmovers. Similarly, the Interview Day noninterview adjustment, determined according to the status of housing units as of Interview Day, was used to adjust the person weights of inmovers. The formulae are described as follows.

For a given block cluster and type of basic address, the Census Day noninterview adjustment factor was computed as

\[ f^{*}_{c} = \frac{ \sum_{\text{Census Day interviews}} w_i \; + \sum_{\text{Census Day noninterviews}} w_i }{ \sum_{\text{Census Day interviews}} w_i } \]

where w_i represents the weight of housing unit i, that is, the inverse of its probability of selection into the A.C.E. sample. When computing the noninterview adjustment factor, the weight w_i incorporated the trimming that occurred in some block clusters. (See Appendix C.) However, the weights did not reflect the sampling for targeted extended search (TES, Chapter 5) for two reasons. First, the noninterview adjustment was done at the housing unit level, but a housing unit could contain some people with TES status and others without it. Second, TES status was not determined until after the matching operation, but information was usually not collected about people in noninterviewed units, and these people were generally not sent to be matched. Therefore, there was not a reasonable way of systematically classifying noninterviews into those with and without TES status.

Similarly, for a given cluster and type of basic address, the Interview Day noninterview adjustment factor was computed as

\[ f^{*}_{i} = \frac{ \sum_{\text{Interview Day interviews}} w_i \; + \sum_{\text{Interview Day noninterviews}} w_i }{ \sum_{\text{Interview Day interviews}} w_i } \]

The example in Figure 6-1 demonstrates the calculation of the noninterview adjustment. When the unweighted number of noninterviewed units in a given noninterview cell (sample block cluster by type of basic address category) was more than twice the unweighted number of interviewed units, then the weights of the noninterviewed units were spread over the interviewed units in a broader cell. This cell was formed by combining the sample block clusters in the same A.C.E. sampling stratum within the same type of basic address. Because the noninterview rates were so small, the noninterview adjustment factors were close to 1 for most housing units in the sample. For Census Day, the factors were smaller than 1.10 for more than 92 percent of the units; for Interview Day, the factors were less than 1.10 for over 98 percent of the units.

Characteristic (Item) Imputation (Person Level)

Production of A.C.E. undercount estimates required data on age, sex, tenure (owner versus nonowner), race, and Hispanic origin to classify respondents by these important demographic characteristics, so they had to be imputed whenever the data were not collected. Characteristic imputation was not carried out for other missing variables (with the exception of the items with unresolved status).
Several variables also used to assign post-strata, such as the location or return rate of the census tract, were the same for everyone in the block. The extent of the missing characteristics is portrayed in Table 6-2.

The imputation rates in the E sample for the five characteristics listed above ranged from 0.3 percent for sex up to 3.8 percent for tenure (using unweighted frequencies). Since the A.C.E. record for each person in the E sample was matched to the Census 2000 edited file and the five characteristics were extracted and copied, the following imputation procedures apply only to the P sample.

P-sample characteristic imputation for the A.C.E. was similar to that for the 1990 PES and the various Census 2000 tests, including the Dress Rehearsal. Age and sex were imputed based on the available demographic distributions determined from the P sample. Tenure was imputed using a form of nearest-neighbor hot-deck procedure. To impute for race and Hispanic origin, the two approaches were combined.

For missing tenure, race, and Hispanic origin, a hot-deck procedure was used to take advantage of the correlations often found in these characteristics among people living in the same block cluster (or, generally, in geographic proximity). The characteristics age and sex are geographically less clustered than tenure, race, and Hispanic origin. Further, the value of age or sex is often considerably affected by specific conditions, such as the person's relationship to the reference person, or whether information is available on the person's spouse. Thus, national distributions conditioned on relevant covariates were used to impute for age and sex. These distributions were constructed before the imputation began, without regard to the imputation for other missing characteristics.

Table 6-2. Percent of Characteristic Imputation in the P and E Samples [Unweighted]
             Total      Percent of people with imputed characteristic         Percent of people with one or
             people     Age    Sex    Tenure    Race    Hispanic origin       more imputed characteristics
P sample     706,245    2.5    1.7    1.9       1.4     2.4                   5.5
E sample     704,602    3.1    0.3    3.8       3.5     3.6                   11.2

Age. The value of age was missing for 2.5 percent (unweighted) of the P sample. When age was missing, one of four age categories (0-17, 18-29, 30-49, and 50 or older), rather than a number, was imputed, because only the category was used to assign people to a post-stratum for estimation. In one-person households, missing age was imputed from the distribution of ages reported in such households. In multiperson households, if the relationship to the reference person was missing, the distribution of ages (excluding those of reference persons) in all multiperson households was used. Otherwise, if the person was the spouse, child, sibling, or parent of the reference person, missing age was generally imputed from a distribution of reported ages using the relationship to the reference person and the age of the reference person. For reference persons, other relatives, and nonrelatives, age was imputed from the distribution of ages reported by persons with the same relationship. See Figure 6-2 for details.
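The selection of a donor distribution for missing age, as described in the paragraph above, can be sketched in Python. This is only an illustration under stated assumptions: the function names, the key labels, the relationship codes, and the fallback used when the reference person's age category is unavailable are all hypothetical, and the actual distributions and decision rules are those of Figure 6-2.

```python
import random

# The four age categories that were imputed in place of a numeric age.
AGE_CATEGORIES = ("0-17", "18-29", "30-49", "50+")

# Relationships for which the reference person's age also conditions the draw.
CLOSE_RELATIONS = {"spouse", "child", "sibling", "parent"}


def age_distribution_key(household_size, relationship, ref_age_category):
    """Pick which reported-age distribution to draw from (labels are hypothetical)."""
    if household_size == 1:
        return ("one_person",)
    if relationship is None:
        # Relationship missing: all multiperson households, excluding reference persons.
        return ("multi_person_nonreference",)
    if relationship in CLOSE_RELATIONS and ref_age_category is not None:
        return (relationship, ref_age_category)
    # Reference persons, other relatives, nonrelatives. Using this branch as a
    # fallback when ref_age_category is None is an assumption of this sketch.
    return (relationship,)


def impute_age_category(key, distributions, rng):
    """Draw one age category from the weights stored under `key`."""
    weights = distributions[key]
    return rng.choices(AGE_CATEGORIES, weights=weights, k=1)[0]
```

A caller would build `distributions` from reported ages before imputation begins and pass a seeded `random.Random` instance so that runs are reproducible.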
Sex. The imputation rate for sex was 1.7 percent in the P sample. For one-person households, sex was imputed from the distribution of sex in all one-person households. To impute the sex of a reference person, if the household had more than one person but no spouse was present, the distribution of sex for reference persons of multiperson households with no spouse present was used. If a spouse was present and sex was missing for only one member of the couple (the reference person or the reference person's spouse), the missing value was imputed as the sex opposite to that reported for the other member. If sex was missing for both the reference person and the spouse, then the sex of the reference person was imputed from the distribution of sex for reference persons with a spouse present. The spouse was then assigned the sex opposite to that of the reference person.

For other persons in multiperson households (that is, other than reference persons and spouses): 1) if the relationship to the reference person was missing, and if no one else in the household was recorded as a spouse of the reference person, sex was imputed from the distribution of sex for persons (excluding reference persons) from all multiperson households; 2) otherwise, sex was imputed from the distribution of sex for persons (excluding reference persons, spouses, and persons with missing relationship) from all multiperson households. Figure 6-3 illustrates the procedure.

Tenure. Household tenure (owner versus nonowner) was missing for 1.9 percent of the people in the P sample. Tenure was imputed from the previous household that had the same type of basic address and had tenure recorded. As with the adjustment for noninterviews, three types of basic address were used: single-family, multiunit, and all other types of units. See Figure 6-4 for further information.
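The tenure rule above is a sequential hot deck: each household with missing tenure takes the value from the most recent earlier household of the same basic address type that reported tenure. A minimal Python sketch follows. The record layout, the field names, and the choice to leave tenure unset when no earlier donor exists are assumptions made for illustration; the processing order of households and any further details are given in Figure 6-4, not here.

```python
def impute_tenure(households):
    """Sequential hot-deck imputation of tenure within basic address type.

    `households` is a list of dicts in processing order, each with the
    hypothetical keys "address_type" and "tenure" (None when missing).
    Returns new dicts with "tenure" filled where a donor exists and an
    "imputed" flag added.
    """
    last_reported = {}  # most recent reported tenure per address type
    result = []
    for hh in households:
        tenure = hh["tenure"]
        if tenure is None:
            # Donor: previous household of the same type with tenure recorded.
            tenure = last_reported.get(hh["address_type"])
            imputed = tenure is not None
        else:
            # Only reported values become donors; imputed values do not.
            last_reported[hh["address_type"]] = tenure
            imputed = False
        result.append({**hh, "tenure": tenure, "imputed": imputed})
    return result
```

Because only reported values update the donor table, an imputed household never passes its value on, which matches the requirement that the donor "had tenure recorded".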
Race. When race was missing (1.4 percent of the P sample), the imputed race could be any of the 63 possible combinations of the six basic race categories: White, Black, American Indian or Alaskan Native, Asian, Native Hawaiian or Other Pacific Islander, and Some Other Race. All 63 categories were treated the same in the imputation. That is, there were no special procedures for any categories or groups of categories.

Whenever possible, missing race was imputed from the same household. Independently for each household member with missing race, one person was selected at random from those household members with reported race and the selected person's race was imputed to the given household member. If race was missing for all household members but someone had reported origin (Hispanic or non-Hispanic), then the race distribution of the nearest previous household with any reported race and the same origin was used. Note that the Hispanic origin of the household was that of the first person on the household roster with origin reported. When race and Hispanic origin were missing for the whole household, the race distribution of the nearest previous household with reported race was used regardless of Hispanic origin. See Figure 6-5 for details.

Hispanic Origin. A value of origin (Hispanic or non-Hispanic) was imputed for 2.4 percent (unweighted) of the P sample. The procedure was analogous to that for imputing missing race. That is, whenever possible, origin was imputed from within the same household. If everyone in the household was missing origin, then the nearest previous household with reported origin and the same race category was used. When both Hispanic origin and race were missing for the whole household, the Hispanic origin distribution of the nearest previous household with reported Hispanic origin was used, regardless of race. For the imputation procedure and the race categories used in it, see Figure 6-6.

For each of the five characteristics discussed, the distribution of imputed values did not necessarily mirror the distribution of reported values, nor was this expected. However, because the imputation rates were low in the P and E samples, the distributions before and after imputation were very similar. See the distribution of characteristics in the table below.

Distribution of Characteristics Before and After Imputation [Weighted]
                          P sample                               E sample
                          Before      Imputed    After           Before      Imputed    After
                          imputes                imputes         imputes                imputes
Race                      (1.4% imputed)                         (3.2% imputed)
  White only              73.5%       67.5%      73.4%           76.9%       57.2%      76.2%
  Black only              11.0%       10.2%      11.0%           11.8%       6.6%       11.6%
  AIAN only               0.6%        0.7%       0.6%            0.8%        0.8%       0.8%
  Asian only              3.5%        3.4%       3.5%            3.7%        2.9%       3.7%
  NHPI only               0.1%        0.3%       0.1%            0.1%        0.3%       0.1%
  Some other race only    8.3%        14.4%      8.4%            4.5%        28.5%      5.3%
  Multiple races          3.0%        3.5%       3.0%            2.3%        3.7%       2.3%
Hispanic origin           (2.3% imputed)                         (3.4% imputed)
  Hispanic                12.4%       11.5%      12.4%           12.5%       9.0%       12.4%
Age                       (2.4% imputed)                         (2.9% imputed)
  0-17                    26.1%       21.7%      26.0%           25.9%       19.7%      25.7%
  18-29                   16.7%       18.9%      16.7%           15.5%       19.0%      15.6%
  30-49                   30.7%       33.0%      30.8%           31.0%       30.9%      31.0%
  50+                     26.5%       26.4%      26.5%           27.6%       30.5%      27.6%
Sex                       (1.7% imputed)                         (0.2% imputed)
  Male                    48.4%       47.2%      48.3%           48.8%       53.9%      48.8%
  Female                  51.6%       52.8%      51.7%           51.2%       46.1%      51.2%
Tenure                    (1.9% imputed)                         (3.6% imputed)
  Owner                   68.4%       70.3%      68.4%           69.9%       65.1%      69.7%
  Nonowner                31.6%       29.7%      31.6%           30.1%       34.9%      30.3%

Assigning Probabilities for Unresolved Cases (Person Level)

After all follow-up activities were completed, there remained a small fraction of the A.C.E. sample without enough information to compute the components of the dual system estimator given in Chapter 7. Their status was said to be unresolved. A procedure called imputation cell estimation was used to assign probabilities for P-sample people with unresolved match or Census Day residence status, and for E-sample people with unresolved enumeration status.

All P- and E-sample persons, resolved and unresolved, were placed into groups called imputation cells based on operational and demographic characteristics. Different variables were used to define cells for P-sample match and residence status and in the E sample for enumeration status. Within each imputation cell the weighted average of 1s and 0s (representing, e.g., match and nonmatch, respectively) among the resolved cases was calculated, and that average was imputed for all unresolved persons in the cell.

One should note that the noninterview adjustment factor was not incorporated into the person weights when these averages were calculated. This is because the noninterview adjustment was designed to spread the weight of noninterviewed housing units over interviewed housing units. However, all persons with resolved residence status in noninterviewed units were nonresidents (since, by definition, if one person in the household was a resident then the household was considered an interview). Therefore, using the noninterview factor to calculate the averages for unresolved cases would have produced a biased estimate of residence probability. The issue of which weights to use was moot when resolving E-sample cases with missing enumeration status, as a noninterview adjustment was not applied to E-sample persons.

Thus, the weights, w_i, used here incorporated all stages of sampling, including the selection of people for targeted extended search, but were not adjusted for household
noninterviews. Any trimming of the weights was also performed before these weighted averages were calculated.

Unresolved Residence Status in the P Sample

After follow-up was completed, all persons in the P sample who were eligible to be matched to the Census (see Chapter 4) were classified into three types, according to their status as a resident in their sampled block at the time of the census: Census Day residents, Census Day nonresidents, and unresolved persons (those for whom there was not enough information to determine the residence status). The results are displayed in Table 6-3.

Table 6-3. Final Residence Status for the P Sample by Mover Status [Unweighted]
                 Total      Final residence status                              Residence rate
                 people     Confirmed    Confirmed      Unresolved             for resolved
                            resident     nonresident    resident               cases
U.S. total       653,337    95.8%        1.9%           2.3%                   98.1%
Mover status
  Nonmover       627,992    96.6%        1.7%           1.7%                   98.3%
  Outmover       25,345     75.2%        7.5%           17.4%                  91.0%

Because of the uncertainty of the actual status of the 15,082 people (2.3 percent of 653,337) with unresolved residence status, a probability of being a Census Day resident was assigned (see equation (6.2)). Then, when computing the dual system estimate, all person nonmovers and outmovers were included with their estimation weight (see Chapter 7) and the following residence probability:

\[ Pr_{res,j} = \begin{cases} 1, & \text{if person } j \text{ is a resident on Census Day} \\ 0, & \text{if person } j \text{ is NOT a resident on Census Day} \\ Pr^{*}_{res,j}, & \text{if person } j \text{ is unresolved} \end{cases} \qquad (6.1) \]

To assign Pr*_res,j for unresolved cases, the Census Day residence probability for inmovers was irrelevant for estimation and was not used. Only nonmovers and outmovers in the P sample who had a resolved final residence status and went through the person matching operation (formally, those with a final match-code status) were used.
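The weighted cell average described earlier in this section, combined with the three-way assignment in equation (6.1), can be sketched as follows. This is a simplified illustration, not the production procedure: the record fields (`cell`, `weight`, `status`) are hypothetical names, every cell is assumed to contain at least one resolved case, and special handling such as the separate rule for match code group 7 is left out.

```python
from collections import defaultdict


def cell_probabilities(persons):
    """Weighted proportion of 1s among resolved cases, per imputation cell.

    Each person is a dict with hypothetical keys:
      "cell"   - imputation cell label
      "weight" - sampling weight (not adjusted for household noninterview)
      "status" - 1 (resident), 0 (nonresident), or None (unresolved)
    """
    weighted_ones = defaultdict(float)
    weighted_total = defaultdict(float)
    for p in persons:
        if p["status"] is not None:  # only resolved cases contribute
            weighted_ones[p["cell"]] += p["weight"] * p["status"]
            weighted_total[p["cell"]] += p["weight"]
    return {c: weighted_ones[c] / weighted_total[c] for c in weighted_total}


def residence_probability(person, cell_probs):
    """Equation (6.1): 1 or 0 if resolved, else the cell's weighted proportion."""
    if person["status"] is not None:
        return float(person["status"])
    return cell_probs[person["cell"]]
```

For example, a cell whose resolved cases carry weights 100 and 300 (residents) and 100 (nonresident) yields 400/500 = 0.8, and that value is assigned to every unresolved person in the cell regardless of that person's own weight.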
They were placed into a number of imputation cells as defined in Table 6-4. Within each cell, among the resolved cases (those with Pr_res,j = 1 or 0) the weighted proportion of Census Day residents, that is, the weighted average of 1s and 0s, was computed:

\[ Pr^{*}_{res,j} = \frac{ \sum_{\text{resolved persons}} w_i \, Pr_{res,j} }{ \sum_{\text{resolved persons}} w_i } \qquad (6.2) \]

where w_i was defined at the beginning of this section. This proportion was then assigned as Pr*_res,j to each unresolved case in the cell. (The exception is for follow-up match code group 7; this is explained below.) The cells used to resolve residence status, along with the probabilities assigned to the unresolved cases, are given in Table 6-4.

Match code groups 1 through 7, which partition the population into mutually exclusive and exhaustive groups, were determined from the match codes and other variables derived before the follow-up operation as explained in Chapter 4. Group 8 was formed differently. Some information from the follow-up operation was coded in time for the A.C.E. missing data procedures. (Under the original schedule, this information would have become available too late to be of use.) After the follow-up operation, a small number of people in the P sample were coded as being potentially fictitious or said to be living elsewhere on Census Day. Such people were placed in Group 8, even though they also qualified for one of the Groups 1 through 7.

Table 6-4. Imputation Cells and Probabilities Assigned for Resolving Residence Status in the P Sample
                                                             Owner                      Nonowner
Match code group                                             Non-Hispanic    Others     Non-Hispanic    Others
                                                             White                      White
1 = Matches needing follow-up                                0.982           0.986      0.993           0.991
2 = Possible matches                                         0.973           0.968      0.966           0.972
3 = Partial household nonmatches needing follow-up
      V3a*                                                   0.755           0.901      0.883           0.928
      V3b*                                                   0.956           0.971      0.959           0.969
4 = Whole household nonmatches needing follow-up,
    not conflicting households                               0.920           0.943      0.911           0.914
5 = Nonmatches from conflicting household                    0.910           0.927      0.945           0.954
6 = Resolved before follow-up                                0.993           0.990      0.990           0.988
7 = Insufficient information for matching
    (weighted column average of groups 1-5 and 8)            0.813           0.867      0.844           0.872
8 = Potentially fictitious or said to be living
    elsewhere on Census Day                                  0.119           0.123      0.177           0.157
* V3a = Group 3 persons age 18-29 listed as child of reference person; V3b = All other group 3 persons.

The two tenure categories were owners and nonowners. Persons were placed into one of two race categories: non-Hispanic White and all others. People of multiple races (for example, a person responding as White and Asian) were placed in the latter group. V3 was a variable defined only for match code group 3, partial household nonmatches. V3a comprised persons in group 3 who were 18 to 29 years of age and were listed on the A.C.E. household roster as a child of the reference person. V3b included all other persons in group 3.

The residence probability for unresolved P-sample persons was computed as described above, except for those in match code group 7, people with insufficient information for matching. Within this set of four cells (see Table 6-4), there were almost no resolved cases from which to extract a probability of being a Census Day resident. Because of the lack of information (most of these cases did not even have a valid name), these people did not go through the matching operation and were not sent to follow-up. To adjust for these cases, a weighted proportion of Census Day residents (1s and 0s) was computed among the resolved cases in each of the four columns of Table 6-4 using match code groups 1 through 5 and 8. Separately for each of the four tenure by race/ethnicity classes, the overall weighted probability of being a resident among those sent to follow-up (groups 1 through 5 and 8) was assigned to those with insufficient information for matching (group 7). Left out of this computation were those people who were resolved before follow-up (group 6). Observations from the Census 2000 Dress Rehearsal indicated that, in terms of their demographic and operational characteristics, people in group 7 tend to be more like those in groups 1 through 5 and 8 than like those in group 6.

In the Dress Rehearsal, only three weighted ratios were calculated for residence probability: a ratio for persons sent to follow-up, a ratio for persons not needing follow-up, and an overall ratio used for persons with insufficient information for matching. Based on Dress Rehearsal results, Kearney and Ikeda (1999) suggest calculating separate ratios by match code group and splitting persons from conflicting households into a separate match code group. The larger Accuracy and Coverage Evaluation sample size in Census 2000 than in the Dress Rehearsal made it possible to separate matches needing follow-up from possible matches. Additional research and discussion suggested adding additional variables within match code
group.UnresolvedMatchStatusComputingthedualsystemestimatorrequiredmeasuringthetotalnumberofP-samplepeoplewhowerematchedtopersonsincludedinthecensus.(Separateestimateswereobtainedfornonmoversandoutmovers,butthatdoesnot affectwhatfollows.)Afterfollow-upactivitieswerecom-pleted,eachconfirmedorpossible(unresolved)CensusDayresidentinthePsamplewasdeterminedtobeamatch,anonmatch,orunresolved(thatis,personsfor whommatchstatuscouldnotbedetermined).Matchsta-tusofconfirmedCensusDaynonresidentswasnotusedintheestimation.AsisseeninTable6-5,unresolved matcheswereinfrequentinthePsample.Thetreatmentofunresolvedmatcheswassimilartothatforunresolvedresidencestatus.Foreachconfirmedor possibleCensusDayresidentjinthePsample,thevalue Prm,jwasassignedas1,0,orPr*m,j,inamanneranalo-goustoequation(6.1),accordingtowhetherthepersonwasamatch,anonmatch,orhadunresolvedmatchsta-tus,respectively.Unresolvedmatchesaccountedfor7,826of640,945peopleinthePsample,or1.2percent.Pr*m,jwasassignedusingimputationcellestimationbasedonthosewitharesolvedmatchstatus.Theformulaisthesameasinequation(6.2),butpertainstomatchstatus,thatis,usesthevaluesofPrm,j.Table6-5.FinalMatchStatusforthePSamplebyMoverStatus | |||
[Unweighted]Psample(confirmedorpossibleresidents)Numberof personsFinalmatchstatusMatchrate for resolved casesMatchNonmatch Unresolved matchU.S.total.........................................640,94590.3%8.5%1.2%91.4%Moverstatus | |||
...................................... | |||
Nonmover......................................617,49091.1%8.0%0.9%91.9% | |||
Outmover......................................23,45567.8%21.7%10.5%75.8%SectionIChapter66-7MissingDataProceduresU.S.CensusBureau,Census2000 Aswithresidencestatus,thecaseswerefirstclassifiedaccordingtoseveralcharacteristics.Withincells,the weightedproportionofmatchesamongtheresolvedcases | |||
-excludingallconfirmedCensusDaynonresidents-was computedandassignedtoeachoftheunresolvedcasesin thesamecell.Again,theweights,w i,aredefinedearlier.Thecharacteristicsusedtodefinetheimputationcellsformatchstatus-differentfromthoseusedforresidencesta-tus-areshowninTable6-6.Theywerebasedonobserva-tionsfromtheCensus2000DressRehearsalandananaly-sisoftheA.C.E.operations.KearneyandIkeda(1999) showedthatmoverstatus(nonmoverversusoutmover)discriminatedwellbetweenmatchesandnonmatchesamongtheresolvedcases.Thehousingunitaddress matchcodereferstotheinitialmatchbetweenhousingunitsontheindependent(A.C.E.)listingandthecensusaddresslist;conflictinghousingunitsweredetermined duringA.C.E.personmatchingactivities.Peoplewithatleastoneimputeddemographicvariable(i.e.,age,sex,race,Hispanicorigin,ortenure)were groupedtogetherforimputationofmatchstatus.Unpub-lishedstudiesindicatethat,atleastintheDressRehearsal,thepresenceoftheseimputedcharacteristicsamong resolvedcasesisnegativelyassociatedwiththepropen-sitytobeamatch.Foroutmoversfromaunitthatwasanonmatchoraconflictinghousehold,peoplewerenotseparatedaccordingtotheirimputedcharacteristics.The reasonwastomaintainareasonablenumberofresolvedcasesineachcellfromwhichtoestimatetheweightedproportionofmatches.Theprobabilitiesassignedto peoplewithunresolvedmatchstatusareprovidedinTable6-6.Itisusefultonotethatmostpersonswithunresolvedmatchstatus(7,693ofthe7,826)hadinsuffi-cientinformationformatching;mostofthemdidnothave avalidname,andtheirrateofmissingcharacteristicswas muchhigherthantheaverage.Further,almostallofthese people(7,506)wereinmatchcodegroup7.Assuch,they didnotgothroughthematchingprocess,norwerethey sentforfollow-up.Thisinformationwasconsideredwhen cellswereselectedforimputationofmatchstatus.Vari-ablessuchasageandethnicity-thathadahighchance 
of being imputed and might be of questionable quality - were avoided.

In the Dress Rehearsal, within each of the four geographic sites, one overall weighted ratio for match probability was calculated and used. Kearney and Ikeda (1999) suggest that separate ratios for outmovers and nonmovers should be calculated.

Unresolved Enumeration Status (E Sample)

The dual system estimator also required the total number of correct enumerations in the E sample. As with operations previously discussed, follow-up activities left each person in the E sample with one of three types of enumeration status: correct, erroneous, or unresolved. The person was assigned a number, Pr_{ce,j}, equal to 1, 0, or Pr_{ce,j}^*, respectively, according to that status, similar to equation (6.1). Table 6-7 shows the distribution of persons according to enumeration status. The values of Pr_{ce,j}^* for the 21,148 unresolved E-sample people (3.0 percent of 704,602) were determined through imputation cell estimation.

Table 6-6. Imputation Cells and Probabilities Assigned for Resolving Match Status in the P Sample

Housing Unit Address Match Code
                Housing unit was a match (code 1)      Housing unit was a nonmatch or the household is conflicting (code 2 or 4)
Mover status    No imputes    1 or more imputes        No imputes    1 or more imputes
Nonmover        0.945         0.901                    0.690         0.567
Outmover        0.798         0.791                    0.516 (both impute categories combined)

Table 6-7. Final Enumeration Status for the E Sample
[Unweighted]

E sample      Number of persons    Correct enumeration    Erroneous enumeration    Unresolved enumeration    Correct enumeration rate for resolved cases
U.S. total    704,602              92.6%                  4.4%                     3.0%                      95.5%

The resolved and unresolved cases were placed in the cells shown in Table 6-8. Within each cell, the weighted proportion of correct enumerations among resolved cases was computed before accounting for duplication with non-E-sample people, analogous to equation (6.2), and then assigned to each unresolved case in the cell.

As with residence status for P-sample people, a key factor in determining enumeration status was the E-sample person's match code. These codes can be found in Chapter 4. People were placed in match code groups in the following sequence:
1) People coded as potentially fictitious or said to be living elsewhere on Census Day (based on information collected during the follow-up operation) were placed in groups 11 and 12, respectively.
2) All other people included in the operation for targeted extended search were placed in group 10. See Chapter 5 for details.
3) People in the remainder of the E sample were then placed in the appropriate match code group, as defined in Table 6-8.

Other characteristics used to define cells were the presence or absence of imputed characteristics (as was used to define cells for match status); whether the person was non-Hispanic White or any other race-ethnicity combination; and V3, as defined in the section on residence status.

There was an additional adjustment made to the enumeration probability of E-sample people as a result of duplication with persons subsampled out of the E sample in large clusters. If the same identity was assigned to u E-sample persons and v persons who were subsampled out of the E sample, 1) one of the u E-sample persons was selected during the person matching operation, and 2) the initial correct enumeration probability was multiplied by u/(u+v) during the missing data activities, as it was not known which person was the actual E-sample person. The other u-1 E-sample persons were assigned a correct enumeration probability of 0.

Table 6-8. Probabilities Assigned for Resolving Enumeration Status in the E Sample

Match code group: No imputed characteristics; 1 or more imputed characteristics
1 = Matches needing follow-up: 0.977; 0.977
2 = Possible matches: 0.968; 0.968
3 = Partial household nonmatches: V3a* 0.871, V3b* 0.974; V3a* 0.908, V3b* 0.960
4 = Whole household nonmatches where the housing unit matched; not conflicting households: Non-Hispanic White 0.965, Others 0.974; 0.958
5 = Nonmatches from conflicting households; for housing units not in regular nonresponse follow-up: 0.975; 0.965
6 = Nonmatches from conflicting households; housing units in regular nonresponse follow-up: 0.914; 0.926
7 = Whole household nonmatches, where the housing unit did not match in housing unit matching: Non-Hispanic White 0.959, Others 0.947; 0.950
8 = Resolved before follow-up: Non-Hispanic White 0.995, Others 0.990; 0.979
9 = Insufficient information for matching: 0.000
10 = Targeted extended search people: 0.928; 0.858
11 = Potentially fictitious people: 0.058; 0.088
12 = People said to be living elsewhere on Census Day: 0.229; 0.210
* V3a = Group 3 persons age 18-29 listed as child of reference person; V3b = All other group 3 persons

Figure 6-1. Adjustment for Noninterviews: An Example

Consider a block cluster with nine housing units, all having the same type of basic address, for example, all single-family homes, as depicted below.
Columns: Housing unit; Weight; Actual situation; Status of (and information from) A.C.E. interview; Census Day interview status; A.C.E. Interview Day interview status

1; 100; Resident on 4/1/00 and at time of A.C.E. interview; Interviewed in A.C.E.; Interview; Interview
2; 100; Resident on 4/1/00 and at time of A.C.E. interview; Neighbor (proxy) interviewed in A.C.E.; Interview; Interview
3; 100; Resident on 4/1/00 and at time of A.C.E. interview; No one interviewed in A.C.E.; Noninterview; Noninterview
4; 100; Vacant on 4/1/00, resident at time of A.C.E. interview; Interviewed in A.C.E., knows of 4/1/00 status; Vacant; Interview
5; 100; Vacant on 4/1/00, resident at time of A.C.E. interview; Interviewed in A.C.E., no knowledge of 4/1/00 status; Noninterview; Interview
6; 100; Vacant on 4/1/00, resident at time of A.C.E. interview; No one interviewed in A.C.E.; Noninterview; Noninterview
7; 100; Resident on 4/1/00, vacant at time of A.C.E. interview; Information obtained from proxy; Interview; Vacant
8; 100; Resident on 4/1/00, vacant at time of A.C.E. interview; No information on 4/1/00 status, Census staff determines vacant at time of A.C.E.; Noninterview; Vacant
9; 100; Resident on 4/1/00, different resident at time of A.C.E. interview; Interviewed in A.C.E., knows of 4/1/00 status; Interview; Interview

Note: In this noninterview cell (sample block cluster x type of basic address), people in interviewed housing units would have received the following noninterview adjustments:
a) To the person weights of nonmovers and outmovers, Census Day noninterview adjustment = 800/400 = 2.
b) To the person weights of inmovers, A.C.E. Interview Day noninterview adjustment = 700/500 = 1.4.
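The arithmetic in the note to Figure 6-1 can be reproduced with a short sketch. The encoding of the nine housing units below is a hypothetical data layout, and the rule used here (weighted interviews plus weighted noninterviews, divided by weighted interviews, with vacant units left out) is inferred from the figure's own numbers rather than taken from a stated formula.

```python
# Hypothetical encoding of the nine housing units in Figure 6-1.
# Each tuple: (weight, Census Day status, A.C.E. Interview Day status).
UNITS = [
    (100, "interview", "interview"),        # unit 1
    (100, "interview", "interview"),        # unit 2
    (100, "noninterview", "noninterview"),  # unit 3
    (100, "vacant", "interview"),           # unit 4
    (100, "noninterview", "interview"),     # unit 5
    (100, "noninterview", "noninterview"),  # unit 6
    (100, "interview", "vacant"),           # unit 7
    (100, "noninterview", "vacant"),        # unit 8
    (100, "interview", "interview"),        # unit 9
]


def noninterview_adjustment(units, day):
    """Return the adjustment factor for one reference day.

    day = 0 selects Census Day status, day = 1 selects A.C.E. Interview
    Day status. Vacant units contribute to neither sum (an assumption
    that matches the figure's 800/400 and 700/500).
    """
    interviewed = sum(u[0] for u in units if u[1 + day] == "interview")
    missed = sum(u[0] for u in units if u[1 + day] == "noninterview")
    return (interviewed + missed) / interviewed


census_day_adj = noninterview_adjustment(UNITS, 0)  # 800 / 400
ace_day_adj = noninterview_adjustment(UNITS, 1)     # 700 / 500
```

The first factor would apply to nonmovers and outmovers and the second to inmovers, as the note states.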
Chapter 7. Dual System Estimation

INTRODUCTION

Dual System Estimation (DSE) was used to estimate coverage of Census 2000 using data from the Accuracy and Coverage Evaluation (A.C.E.) Survey. DSE was also used by the U.S. Census Bureau to estimate census coverage for the 1980 and 1990 censuses, and to evaluate coverage prior to 1980. The use of DSE for measurement of coverage in 1980 is described in Fay et al. (1988), while Hogan (1992, 1993) describes the use of DSE in 1990. As described in Killion (1998), several alternatives to DSE were considered for Census 2000. For each alternative, either the results were shown to be grossly inferior to those of DSE or the research was not conclusive.

This chapter provides the details of DSE for the Census 2000 A.C.E. The DSE was calculated separately for a set of population groups referred to as post-strata. The post-stratification variables and the final post-stratification plan are discussed in detail. In addition, the variance estimation methodology used in each post-stratum is summarized and some basic results are given.

DUAL SYSTEM ESTIMATION

This section contains the details of the DSE calculated within each final post-stratum. It describes the basic DSE model, including a discussion of the advantage of post-stratification. The details of the DSE computed within each final post-stratum for Census 2000 are presented. All components of the DSE are defined. The DSE accounted for special handling of missing data, search areas for matching, and movers. Missing data and search areas for matching are covered in detail in Chapters 6 and 5, respectively.
The method used to handle special problems caused by movers in the Census 2000 DSE is also discussed. The attachment provides detailed background on options for dealing with movers in census coverage measurement surveys.
The section concludes with a short discussion of how the DSE results serve as input to synthetic estimation down to the block level. A detailed discussion of synthetic estimation is provided in Chapter 8 and Haines (2001).

DSE Model

The DSE model is discussed in detail in Wolter (1986) and more generally in Hogan (1992). This chapter gives a general presentation. The DSE model (applied within each post-stratum) conceptualizes each person as having a probability of being either in or not in the census enumeration, as well as either in or not in the A.C.E.

Table 7-1. DSE Model

                  In census    Out of census    Total
In A.C.E.         N_{11}       N_{12}           N_{1+}
Out of A.C.E.     N_{21}       N_{22}           N_{2+}
Total             N_{+1}       N_{+2}           N_{++}

All cells are conceptually observable except N_{22} and any of the marginal cells that include N_{22} (i.e., N_{2+}, N_{+2}, and N_{++}). The model assumes independence between the census and the A.C.E. This means that the probability of being in the ij-th cell, p_{ij}, is the product of the marginal probabilities, p_{i+} p_{+j}. The estimate of total population in a post-stratum with the independence assumption is

$DSE = \hat{N}_{++} = N_{+1} N_{1+} / N_{11}$.

The independence assumption can be in error, either due to causal dependence between the census enumeration and the A.C.E. enumeration, or due to heterogeneity in capture probabilities within a post-stratum. Causal dependence occurs when the event of an individual's inclusion or exclusion from one system affects his or her probability of inclusion in the other system. For example, some people who did answer the census may not have cooperated with the A.C.E., thinking they had helped enough.
As another example, a person contacted during A.C.E. listing may not have responded to the census thinking that the A.C.E. lister already recorded them. However, even if causal independence is true for all individuals (p_{ij} = p_{i+} p_{+j}), the independence assumption can be violated by heterogeneity. For the assumption to hold, either the census inclusion probabilities p_{+1} or the A.C.E. inclusion probabilities p_{1+} must be the same for all individuals. This means that homogeneity in both systems is not required. For example, some people may try their best to avoid being counted in both the census and A.C.E., resulting in these people having much smaller inclusion probabilities than other people. Error in the independence assumption for either reason results in correlation bias.

Post-stratification, or grouping of individuals likely to have similar inclusion probabilities, and calculating DSEs within post-strata was done to decrease correlation bias. Research was carried out to determine effective variables for the A.C.E. post-stratification design. All variables included in the 1990 PES post-stratification were considered, as were several new ones. The specific variables considered were race/Hispanic origin, age/sex, tenure, household composition, relationship, urbanicity, percent owner, return rate, percent minority, type of enumeration area, household size, hard-to-count scores, census division, census region, and regional census center. From these variables, fifteen post-stratification options were developed for empirical research. For each post-stratification option, mean square errors of total population estimates and synthetic estimates were computed at the national, state, and congressional district levels, as well as for selected cities. The major conclusions were as follows:
* The demographic variables used in the 1990 PES were effective, but did not fully capture the geographic differences, especially those affected by the quality of the Master Address File. An urbanicity/type of enumeration variable appeared to capture much of the geographic differences.
* The tract-level return rate variable captured some of the socioeconomic differences for synthetic estimates at lower levels of aggregation.

Details of the Census 2000 post-stratification research methodology are given in Kostanich et al. (1999) and Griffin (1999). Results of this research are given in Griffin and Haines (2000) and Schindler (2000). The post-stratification design chosen for Census 2000 is provided in this chapter.

The DSE can be written as follows:
$DSE = N_{+1} (N_{1+} / N_{11})$

That is, the total population is estimated by the number captured in the census times the ratio of those captured in the A.C.E. survey to those captured in both systems. In practice, the components of the DSE are estimated from a sample survey. N_{+1} is not the census count; the census count (C) must be corrected for erroneous enumerations, as well as for persons enumerated in the census with insufficient information to match to the A.C.E. enumeration. To actually estimate the number of people correctly enumerated in the census, a sample of all data-defined persons is selected. This sample of data-defined census persons is called the enumeration or E sample. To estimate the ratio of those captured in both systems to those captured in A.C.E., the population or P sample is used. The P sample consists of persons interviewed during A.C.E. enumeration.

The form of the DSE used in census coverage measurement surveys such as A.C.E. is as follows:
$DSE = DD \times (CE / N_e) \times (N_p / M)$

where
DD = the number of census data-defined persons eligible and available for A.C.E. matching,
CE = the estimated number of correct enumerations from the E sample,
N_e = the estimated number of people from the E sample,
N_p = the estimated total population from the P sample,
M = the estimated number of persons from the P-sample population who match to the census.

Note: Persons in Group Quarters are excluded from all the above counts for A.C.E., as were persons in housing units who were added to the census after E-sample Identification (late adds).
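As a minimal sketch of the formula above, the function below multiplies the data-defined count by the estimated correct enumeration rate and by the inverse of the estimated match rate. The function name, argument names, and the input figures are illustrative assumptions, not values from the A.C.E.

```python
def dual_system_estimate(dd, ce, n_e, n_p, m):
    """Compute DSE = DD * (CE / N_e) * (N_p / M) for one post-stratum.

    dd  : census data-defined persons available for matching
    ce  : weighted estimate of correct enumerations (E sample)
    n_e : weighted estimate of E-sample persons
    n_p : weighted estimate of the P-sample population
    m   : weighted estimate of P-sample persons matched to the census
    """
    if n_e <= 0 or m <= 0:
        raise ValueError("N_e and M must be positive")
    correct_enumeration_rate = ce / n_e
    inverse_match_rate = n_p / m
    return dd * correct_enumeration_rate * inverse_match_rate


# Made-up inputs: a 95 percent correct enumeration rate and a 95 percent
# match rate offset each other, so the estimate equals the DD count.
example = dual_system_estimate(dd=1000.0, ce=950.0, n_e=1000.0,
                               n_p=500.0, m=475.0)
```

A lower match rate with the same correct enumeration rate raises the estimate above DD, which is how the estimator signals a net undercount.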
Definitions

Block Cluster. A grouping of one or more census blocks. Block clusters are the primary sampling units for A.C.E. and average about 30 housing units each.

Correct Enumeration (CE). A correct enumeration is a person who is enumerated in a sample block cluster during the census who is also determined by A.C.E. operations to have lived in that block cluster (or if appropriate a surrounding block) on Census Day. Correct enumerations have a correct enumeration probability, Pr_{ce,j}, equal to 1 for each person j.

Correct Enumeration Probability (Pr_{ce,j}). This is defined as the probability that person j in the E sample was correctly enumerated in the A.C.E. block cluster (or surrounding block). The probability of correct enumeration is typically 0 or 1, but it can take on values within this range due to missing data imputation.

Coverage Correction Factor (CCF). The coverage correction factor for a post-stratum is calculated by dividing the DSE for that post-stratum by its census count. A.C.E. synthetic estimates for any data item for any geographic area are obtained by multiplying the coverage correction factor by the census count within each post-stratum, then summing over all post-strata (see Chapter 8 for details on synthetic estimation).

Data-Defined Person. This concept is defined for all census persons. A data-defined person is a person who has two or more of the 100-percent data items answered on the census form. Any items can be selected from the 100-percent data items, which include name, age, sex, race, and Hispanic origin. Relationship to person one is also a 100-percent data item for all persons besides person one. Persons not satisfying these criteria are referred to as non-data-defined.

E Sample. The E sample is the Enumeration sample. It consists of all data-defined persons in the A.C.E. block clusters who were enumerated in the census.

Group Quarters (GQ) Persons. Persons living in GQs, such as college dormitories, prisons, or military barracks.
GQ persons were not covered in the A.C.E. and are excluded from the A.C.E. universe.

Inmover. A person who moved into a P-sample housing unit after Census Day.

Insufficient Information in Census (II). Those persons in the census for whom there is insufficient information for inclusion in the E sample. Very little data is available for these persons. This category includes non-data-defined persons and persons in whole household imputations. Note that insufficient information in census is different from insufficient information for matching. The former are excluded from the E sample and the latter are included in the E sample.

Late Adds. Late Adds are persons in housing units who were added to the census after E-sample Identification.
These housing units had an unknown final status at the time of A.C.E. matching but were subsequently included in the census. Persons who are Late Adds were ineligible for matching and, therefore, not included in the census DSE component.

Match Probability (Pr_{m,j}). This is defined as the probability that person j in the P sample was matched to a census person in the search area (or in a TES block). The match probability is typically 0 or 1, but it can take on values within this range due to missing data imputation.

Mover Status. Each person in the P sample was classified as a nonmover, outmover, or inmover.

Nonmover. An A.C.E. sample person whose housing unit on Census Day and A.C.E. Interview Day are identical.

Outmover. A person who moved out of an A.C.E. housing unit between Census Day and the date of the A.C.E. interview.

P Sample. Also known as the Population sample. The P sample consists of those persons confirmed to be residents of the housing units in the A.C.E. block clusters as of Census Day by the independent portion of the A.C.E. reinterview and subsequent operations.

Residence Probability (Pr_{res,j}). The probability that person j on the P-sample file is a resident of the sample household on Census Day. All inmovers are assumed to be A.C.E. Interview Day residents. Nonmovers and outmovers can be Census Day nonresidents, if information indicates they were not a resident of the sample household based on census residency rules. The residence probability is typically 0 or 1 but it can take on values within this range due to missing data imputation.

Targeted Extended Search (TES). A.C.E. operation in which block clusters are identified and selected for a search of the immediate surrounding area to find persons geographically mis-located in a block neighboring the A.C.E. block cluster. More generally, it is the methodology for targeting, sampling, and implementing the search operations in the field.

DSE Formula

The DSE for any given post-stratum was calculated by:
$DSE = DD \times (CE / N_e) \times [ (N_n + N_i) / (M_n + (M_o / N_o) N_i) ]$

All counts and estimates are for a specific post-stratum and the subscripts n, i, and o stand for nonmovers, inmovers, and outmovers, respectively. Adjustments to this DSE were occasionally made to avoid the unlikely event that the formula results in division by zero. For post-strata with fewer than ten (unweighted) outmover persons, the ratio inside the square brackets was changed to the following:

$(N_n + N_o) / (M_n + M_o)$.

Coverage Correction Factor Formula

The coverage correction factor (CCF) is a measure of the net overcount or net undercount of the household population within the census. The CCF for a post-stratum is the ratio of the DSE to the census count:
$CCF = DSE / C$

where
C = the final census household population count, where C = DD + II + LA,
II = the number of census people with insufficient information,
LA = the number of people added (late) to the census and not available for A.C.E. matching. Late Adds include both data-defined and non-data-defined records.

Note: The numerator of the CCF is based on data-defined persons. The denominator includes data-defined and non-data-defined persons as well as late adds. Thus, we are implicitly assuming the coverage of late adds and non-data-defined persons is the same as that for data-defined persons.

For example, a coverage correction factor of 1.05 would imply that for every 100 people counted in the census within the given post-stratum, the net undercount is five persons.

DSE Components

Each component of the DSE is described next.

DD is the census count (unweighted) of data-defined persons in the post-stratum.

The estimated number of E-sample persons is written as:
$N_e = \sum_{j \in \text{E sample} } W_j^*$

where W_j^* = inverse of the probability of selection, including a factor for Targeted Extended Search sampling.

The estimated number of correct enumerations is calculated as:

$CE = \sum_{j \in \text{E sample} } Pr_{ce,j} W_j^*$

where Pr_{ce,j} is:
1 if person j was correctly enumerated,
0 if person j was NOT correctly enumerated, or
Pr_{ce,j}^* if person j is unresolved, where Pr_{ce,j}^* is estimated through missing data imputation.

Note: Probabilities for persons with unresolved final correct enumeration status in the E sample or unresolved final residence or match status in the P sample are assigned using imputation cell estimation within groups. See Chapter 6 for details. Within each group, a probability equal to a simple proportion is imputed for unresolved persons. For example, E-sample (or P-sample) persons in a group with unresolved enumeration (match) status were assigned a correct enumeration (match) probability that is the proportion of correct enumerations (matches) among persons with resolved enumeration (match) status in the group. The probabilities are denoted in the DSE formulas as follows:
Pr_{m,j}^* is the estimated match probability for unresolved match status,
Pr_{res,j}^* is the estimated residence probability for unresolved residence status,
Pr_{ce,j}^* is the estimated enumeration probability for unresolved enumeration status.

Some persons moved between Census Day and A.C.E. Interview Day. A mover is a person whose location on the day of the A.C.E. interview differs from his or her location on Census Day. The treatment of movers has important ramifications for estimation. The attachment to this chapter titled The Effect of Movers on Dual System Estimation provides a discussion on alternative methodologies for handling movers. For Census 2000, movers were treated by a procedure known as Procedure C, unless a post-stratum had fewer than ten (unweighted) outmover persons.
In this case, Procedure A was implemented. Procedure C identifies all current residents living or staying at the sample address at the time of the A.C.E. interview (nonmovers and inmovers), plus all other persons who lived at the sample address on Census Day who have since moved (outmovers). The P sample includes nonmovers and outmovers. For outmovers, the interviewers attempted a proxy interview to obtain data such as name, sex, and age that were used for matching. The match rate for inmovers was estimated by the match rate of outmovers. In contrast, the number of movers in the P sample for A.C.E. sample areas was estimated by the inmovers. Note that no matching was done for inmovers.
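The choice between Procedure C and Procedure A can be sketched as a single function that follows the bracketed ratio in the DSE Formula section. The function and argument names are hypothetical, the inputs are assumed to be the already weighted component estimates, and the sample numbers are invented for illustration.

```python
def dse_with_movers(dd, ce, n_e, n_n, n_i, n_o, m_n, m_o,
                    unweighted_outmovers):
    """Sketch of the post-stratum DSE with mover handling.

    Procedure C applies the outmover match rate (m_o / n_o) to the
    inmover total n_i. With fewer than ten unweighted outmovers the
    ratio falls back to Procedure A, which uses outmovers directly.
    """
    if unweighted_outmovers < 10:
        # Procedure A: (N_n + N_o) / (M_n + M_o)
        ratio = (n_n + n_o) / (m_n + m_o)
    else:
        # Procedure C: (N_n + N_i) / (M_n + (M_o / N_o) * N_i)
        ratio = (n_n + n_i) / (m_n + (m_o / n_o) * n_i)
    return dd * (ce / n_e) * ratio


# Invented component estimates for one post-stratum.
components = dict(dd=1000.0, ce=900.0, n_e=1000.0, n_n=800.0,
                  n_i=100.0, n_o=80.0, m_n=720.0, m_o=60.0)

dse_procedure_c = dse_with_movers(unweighted_outmovers=25, **components)
dse_procedure_a = dse_with_movers(unweighted_outmovers=5, **components)
```

In this invented case the two procedures differ only slightly; the fallback exists because an outmover match rate based on very few persons would be unstable, and because N_o could be zero.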
N_n is the weighted total population for nonmovers for the post-stratum from the P sample. The weight for each person j is the product of three values:
1. the inverse of the P-sample selection probability including a factor for the Targeted Extended Search sampling (W_j^*),
2. a noninterview adjustment based on Census Day interview status (f_{c,j}^*), and
3. a Census Day residence probability (Pr_{res,j}).

The estimated number of P-sample nonmovers is calculated as:

$N_n = \sum_{j \in \text{Nonmovers} } f_{c,j}^* Pr_{res,j} W_j^*$

where Pr_{res,j} is:
1 if person j is a resident on Census Day,
0 if person j is NOT a resident on Census Day, or
Pr_{res,j}^* if person j is unresolved, where Pr_{res,j}^* is estimated through missing data imputation.

Note: Persons who were not residents on Census Day are not included in N_n since Pr_{res,j} = 0 is a multiplicative factor in each person's contribution to N_n.

The estimated number of P-sample nonmover matches is written as:
$M_n = \sum_{j \in \text{Nonmovers} } Pr_{m,j} f_{c,j}^* Pr_{res,j} W_j^*$

where Pr_{m,j} is:
1 if person j is a match on Census Day,
0 if person j is NOT a match on Census Day, or
Pr_{m,j}^* if person j is unresolved, where Pr_{m,j}^* is estimated through missing data imputation.

N_i is the weighted total population for inmovers for the post-stratum from the P sample. The weight for each person j is the product of two values:
1. the inverse of the P-sample probability of selection (W_j^* as defined above), and
2. a noninterview adjustment factor based on A.C.E. Interview Day status (f_{a,j}^*).

The estimated number of P-sample inmovers is denoted:

$N_i = \sum_{j \in \text{Inmovers} } f_{a,j}^* W_j^*$

Note that all inmovers are assumed to be A.C.E. Interview Day residents.

The estimated number of P-sample outmovers is written:
$N_o = \sum_{j \in \text{Outmovers} } f_{c,j}^* Pr_{res,j} W_j^*$

The estimated number of P-sample outmover matches is calculated as:
$M_o = \sum_{j \in \text{Outmovers} } Pr_{m,j} f_{c,j}^* Pr_{res,j} W_j^*$

Synthetic Estimation

The estimated coverage correction factors for each post-stratum were used to form synthetic estimates. Synthetic estimation combines coverage error results with census counts at the block level to produce adjusted block-level population estimates. The synthetic methodology assumes coverage correction factors do not vary within a post-stratum. As a result, one coverage correction factor is assumed to be appropriate for all geographic areas within each post-stratum. To obtain block-level synthetic estimates, block-level census counts for post-strata are multiplied by post-stratum coverage correction factors and aggregated. There is one coverage correction factor for each post-stratum, and each person in a block is in one post-stratum. For example, suppose all persons in a block fall into one of six post-strata. A synthetic estimate for this block is formed by summing the product of census counts for that block and post-stratum with its corresponding coverage correction factor. A controlled rounding technique was implemented, resulting in the creation of person records at the block level. Subsequent tabulations, based on the original and replicated records, are corrected for coverage error. A detailed discussion of synthetic estimation is provided in Chapter 8 and Haines
(2001).

POST-STRATIFICATION

Background

The goal of post-stratification for dual system estimation is to establish groups of persons who are expected to have similar coverage. A common assumption is that people who are subject to similar housing, language, education, and cultural attitudes would also share similar census coverage. Hogan (1993) indicated that tenure, race and ethnic origin, age/sex, and degree of urban development were reasonable markers for these similarities in the 1990 census. An earlier section noted, however, that the independence assumption of the DSE model can be in error due to heterogeneous capture probabilities within a post-stratum. Post-strata are formed to support DSE by grouping persons with similar census coverage, so as to reduce heterogeneity in capture probabilities for DSEs.

In many surveys, post-stratification is done to reduce variances and partially correct for problems in sampling or undercoverage. For DSE, the primary reason for post-stratification is to reduce heterogeneity bias. Any variance reduction or sampling bias correction associated with post-stratification is a bonus. In fact, the usual trade-off is that forming many post-strata reduces heterogeneity at the expense of adding variance. As the number of post-strata increases, fewer people in the coverage measurement survey fall into each individual post-stratum.

The post-stratification plan for Census 2000 A.C.E. is summarized in this section. Also, the detailed definitions of the post-stratification variables and the race/Hispanic origin domains are given. See Haines (2001b) for further details.

The 2000 A.C.E. differs from the 1990 Post-Enumeration Survey (PES) in that it has approximately twice the sample size of the PES. This larger sample size permitted the formation of more post-strata, which has the advantage of reducing correlation bias, as well as sampling variance. Additionally in 2000, multiple responses to the race question were permitted; in 1990 only one race could be selected.

The 1990 PES post-strata started with a cross-classification of seven variables: age, sex, race, Hispanic origin, tenure,
urbanicity, and region. There were 840 cells in the cross-classification. Collapsing was necessary in order to produce post-strata with sufficient sample for reliable Dual System Estimation (DSE). The collapsing reduced the number of post-strata to 357.

Race and Hispanic origin were considered the most important variables to retain in 1990. After collapsing, five race/Hispanic origin post-strata were maintained: Non-Hispanic White or Other, Black, Hispanic White or Other, Asian and Pacific Islander, and Reservation Indians. Off-reservation American Indians were placed in either the Non-Hispanic White or Other group or the Hispanic White or Other group, depending on whether they were of Hispanic origin. Within each of these race/Hispanic origin post-strata, seven age/sex categories were maintained.

The other variables were collapsed in the following order: region, urbanicity, then tenure, if necessary. For American Indians residing on reservations, all these variables were collapsed. For Asian and Pacific Islanders, region and urbanicity were collapsed and tenure maintained. For the Black and Hispanic White or Other groups, region was collapsed for two levels of urbanicity. For Non-Hispanic White or Other, the full cross-classification of region, urbanicity and tenure was maintained. Griffin and Haines (2000b) provide a detailed table on the 1990 PES post-stratification.

Post-Stratification Plan

The Census 2000 A.C.E. retained most of the 1990 PES post-stratification variables and included several additional ones. Nine variables were used in 2000: age, sex, race, Hispanic origin, tenure, region, Metropolitan Statistical Area size/Type of Enumeration Area, and tract-level return rate. The Metropolitan Statistical Area size variable replaced the urbanicity variable, which was not available until the summer of 2001. Type of Enumeration Area (TEA) and the tract return rate were two new features of the 2000 A.C.E. post-stratification. The mailout/mailback areas were differentiated from other types of enumeration areas. In addition, tracts were classified by high or low return rates. Multiple responses to the race question were
reflected in the race and Hispanic origin groupings.

Table 7-2 shows the 64 post-stratum groups for the Census 2000 A.C.E. Within each post-stratum group, there are seven age/sex groups (shown in Table 7-3). Thus, there was a maximum of 64 x 7 = 448 post-strata. The P-sample size was too small or the sampling variance too high for eight of the 64 post-stratum groups. For each of these eight groups, the 7 age/sex post-strata were collapsed into 3 post-strata (under 18; males 18+ and females 18+). As a result, direct DSEs were calculated within each of 416 post-strata, which were expanded to 448 DSEs using synthetic estimation for the collapsed groups. The post-stratification plan was chosen to reduce correlation bias without having an adverse effect on the variance of the dual system estimator. Following is a detailed description of the post-stratification variables including an explanation of the race/Hispanic origin domain assignment.

Table 7-2. Census 2000 A.C.E. 64 Post-Stratum Groups (U.S.)

Race/Hispanic origin domain number*; Tenure; MSA/TEA; High return rate (NE, MW, S, W); Low return rate (NE, MW, S, W)

Domain 7 (Non-Hispanic White or Some other race)
  Owner
    Large MSA MO/MB: High 01 02 03 04; Low 05 06 07 08
    Medium MSA MO/MB: High 09 10 11 12; Low 13 14 15 16
    Small MSA & Non-MSA MO/MB: High 17 18 19 20; Low 21 22 23 24
    All other TEAs: High 25 26 27 28; Low 29 30 31 32
  Nonowner
    Large MSA MO/MB: High 33; Low 34
    Medium MSA MO/MB: High 35; Low 36
    Small MSA & Non-MSA MO/MB: High 37; Low 38
    All other TEAs: High 39; Low 40
Domain 4 (Non-Hispanic Black)
  Owner
    Large MSA MO/MB and Medium MSA MO/MB: High 41; Low 42
    Small MSA & Non-MSA MO/MB and All other TEAs: High 43; Low 44
  Nonowner
    Large MSA MO/MB and Medium MSA MO/MB: High 45; Low 46
    Small MSA & Non-MSA MO/MB and All other TEAs: High 47; Low 48
Domain 3 (Hispanic)
  Owner
    Large MSA MO/MB and Medium MSA MO/MB: High 49; Low 50
    Small MSA & Non-MSA MO/MB and All other TEAs: High 51; Low 52
  Nonowner
    Large MSA MO/MB and Medium MSA MO/MB: High 53; Low 54
    Small MSA & Non-MSA MO/MB and All other TEAs: High 55; Low 56
Domain 5 (Native Hawaiian or Pacific Islander)
  Owner: 57
  Nonowner: 58
Domain 6 (Non-Hispanic Asian)
  Owner: 59
  Nonowner: 60
American Indian or Alaska Native
  Domain 1 (On Reservation)
    Owner: 61
    Nonowner: 62
  Domain 2 (Off Reservation)
    Owner: 63
    Nonowner: 64

* For Census 2000, persons can self-identify with more than one race group. For post-stratification purposes, persons are included in a single Race/Hispanic Origin Domain. This classification does not change a person's actual response. Further, all official tabulations are based on actual responses to the census.

Table 7-3. Census 2000 A.C.E. Age/Sex Groups

Age group    Male    Female
Under 18     1 (both sexes)
18 to 29     2       3
30 to 49     4       5
50+          6       7

Post-stratification Variables

This section gives a detailed description of the post-stratification variables including the handling of multiple responses to the race question. A.C.E. post-stratification used the following variables:
* race/Hispanic origin - seven categories
* age/sex - seven categories
* tenure - two categories
* Metropolitan Statistical Area (MSA) by Type of Enumeration (TEA) - four categories
* return rate - two categories
* region - four categories

The seven race/Hispanic origin domains were:
* American Indian or Alaska Native on Reservations
* Off-Reservation American Indian or Alaska Native
* Hispanic
* Non-Hispanic Black
* Native Hawaiian or Pacific Islander
* Non-Hispanic Asian
* Non-Hispanic White or Some other race

Inclusion in a race/Hispanic origin domain is complicated, as it depends on several variables and whether there are multiple race responses. In addition, inclusion in a race/Hispanic origin domain does not change a person's race/Hispanic origin response. All Census 2000 tabulations are based on the actual responses. For example, a person who responded as American Indian on a reservation and Black was placed in the first race/Hispanic origin category (Domain 1) for post-stratification purposes, but was tabulated in the census as American Indian/Black.

The seven age/sex categories were:
1. Under 18
2. 18-29 male
3. 18-29 female
4. 30-49 male
5. 30-49 female
6. 50+ male
7. 50+ female

The two tenure categories were:
1. Owner
2. Nonowner

The four MSA/TEA categories were:
1. Large MSA Mailout/Mailback (MO/MB)
2. Medium MSA MO/MB
3. Small MSA or Non-MSA MO/MB
4. All other TEAs

MSA/CMSA FIPS codes, as defined by the Office of Management and Budget, were used for post-stratification. For simplification, MSA/CMSA will herein be referred to as MSA. Large MSA consists of the ten largest MSAs based on unadjusted, Census 2000 total population counts including the population in Group Quarters. Medium MSAs are those (besides the largest 10) that have at least 500,000 total population. Small MSAs are those with a total population size less than 500,000. For post-stratification purposes, MO/MB areas were contrasted with the non-MO/MB areas.

The two return rate categories were:
1. High
2. Low

Return rate is a tract-level variable measuring the proportion of occupied housing units in the mailback universe that returned a census questionnaire. Low (high) return rate tracts are those tracts whose return rate is less than or equal to (greater than) the 25th percentile return rate. Separate 25th percentile cut-off values were formed for the six applicable race/Hispanic origin by tenure groups. Persons in List/Enumerate, Rural Update/Enumerate, and Urban Update/Enumerate TEAs were automatically placed in the High category.

The four region categories were:
1. Northeast
2. Midwest
3. South
4. West

Pre-Collapsing

Pre-collapsing was done prior to data collection and knowledge of the exact sample size in each post-stratum.
All race/Hispanic origin, age/sex, and tenure categories for the U.S. were initially maintained. The research for the determination of the important post-stratification variables provided information on the expected sample size in each category which was then used to define a collapsing hierarchy. The pre-collapsing plan for the region, MSA/TEA and return rate variables was as follows:
* Non-Hispanic White or Some other race Owners: No collapsing.
* Non-Hispanic White or Some other race Non-owners: Region was eliminated.
* Non-Hispanic Black: Region was eliminated. In addition there was partial collapsing of the MSA/TEA variable within return rate and tenure categories.
* Hispanic: Region was eliminated. In addition there was partial collapsing of the MSA/TEA variable within return rate and tenure categories.
* Native Hawaiian or Pacific Islander: The region, return rate and MSA/TEA variables were eliminated. Only tenure and age/sex were retained.
* Non-Hispanic Asian: The region, return rate and MSA/TEA variables were eliminated. Only tenure and age/sex were retained.
* American Indian or Alaska Native on Reservations: The region, return rate and MSA/TEA variables were eliminated. Only tenure and age/sex were retained.
* Off-Reservation American Indian or Alaska Native: The region, return rate and MSA/TEA variables were eliminated. Only tenure and age/sex were retained.
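The pre-collapsing plan above can be checked by enumerating the groups it leaves. The sketch below is only a counting aid: the tuple layout and constant names are hypothetical, the two-level MSA/TEA grouping for the Non-Hispanic Black and Hispanic domains is read from Table 7-2, and the enumeration order does not reproduce the table's group numbering.

```python
from itertools import product

REGIONS = ("NE", "MW", "S", "W")
RETURN_RATES = ("High", "Low")
TENURES = ("Owner", "Nonowner")
MSA_TEA_4 = ("Large MSA MO/MB", "Medium MSA MO/MB",
             "Small MSA & Non-MSA MO/MB", "All other TEAs")
# Partially collapsed MSA/TEA variable (two levels, per Table 7-2).
MSA_TEA_2 = ("Large or Medium MSA MO/MB",
             "Small MSA & Non-MSA MO/MB or All other TEAs")
AGE_SEX_GROUPS = 7


def post_stratum_groups():
    """List (domain, tenure, MSA/TEA, return rate, region) groups."""
    groups = []
    # Domain 7 owners: full cross-classification, no collapsing.
    for msa, rate, region in product(MSA_TEA_4, RETURN_RATES, REGIONS):
        groups.append(("Domain 7", "Owner", msa, rate, region))
    # Domain 7 nonowners: region eliminated.
    for msa, rate in product(MSA_TEA_4, RETURN_RATES):
        groups.append(("Domain 7", "Nonowner", msa, rate, None))
    # Domains 4 and 3: region eliminated, MSA/TEA partially collapsed.
    for domain in ("Domain 4", "Domain 3"):
        for tenure, msa, rate in product(TENURES, MSA_TEA_2, RETURN_RATES):
            groups.append((domain, tenure, msa, rate, None))
    # Domains 5, 6, 1 and 2: only tenure (and age/sex) retained.
    for domain in ("Domain 5", "Domain 6", "Domain 1", "Domain 2"):
        for tenure in TENURES:
            groups.append((domain, tenure, None, None, None))
    return groups


GROUPS = post_stratum_groups()
MAX_POST_STRATA = len(GROUPS) * AGE_SEX_GROUPS
```

Under these assumptions the enumeration yields 32 + 8 + 16 + 8 = 64 groups and 64 x 7 = 448 post-strata, matching the totals stated in the text.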
Post-Collapsing

A.C.E. post-stratification included a plan to collapse post-strata that contained fewer than 100 (unweighted) P-sample persons, called post-collapsing, considering such a post-stratum too small to produce reliable estimates. If a collapsed post-stratum was still too small, it could have been further collapsed. The collapsing procedure was hierarchical and required a pre-defined collapsing order. Given the pre-collapsing plan that yielded 448 post-strata, not much post-collapsing was anticipated, but an extensive post-collapsing strategy was designed for completeness and to satisfy the requirement of pre-specification.

Note that collapsing does not necessarily imply elimination of a variable. Collapsing can refer to a reduction in the number of categories for a variable. The following general outline describes the post-collapsing hierarchy that was planned:
* If any of the 448 post-strata are too small, collapse age/sex first. This means that within any of the 64 U.S. post-stratum groups, if at least one of the seven age/sex categories defined in Table 7-3 has fewer than 100 P-sample persons, reduce age/sex to the following three categories: Under 18, 18+ male, and 18+ female.
* If some post-strata are still too small and require collapsing, collapse region next, if applicable. This collapsing applies only to the Non-Hispanic White or Some other race domain since the variable region is only included in its post-stratification definition. In this case, all levels of region (Northeast, Midwest, South, West) are combined to eliminate the variable.
* Next, collapse the four-level MSA/TEA variable into the following two groups:
  * Large and medium MSA MO/MB
  * Small MSA and non-MSA MO/MB and all other TEAs
* If further collapsing is necessary, return rate is the next variable to collapse. High and Low return rate categories are combined to eliminate the variable.
* Next, collapse the variable MSA/TEA. If necessary, the two groups defined above would be combined to eliminate the variable MSA/TEA completely.
* The next variable to collapse is tenure. Owner and non-owner categories are combined to eliminate the variable entirely, if necessary.
* If collapsing is still needed, the three remaining age/sex post-strata are combined to eliminate the age/sex variable completely.
* In the event that there are fewer than 100 P-sample persons in a race/Hispanic origin domain, combine all persons in that domain with Domain 7, which includes non-Hispanic White and Some other race.

In practice, only the first step of collapsing was necessary. Eight of the 64 post-stratum groups had their 7 age/sex post-strata collapsed to 3 age/sex groups, resulting in 32 fewer post-strata. Thus, there were 448 - 32 = 416 post-strata.

Race and Hispanic Origin Classifications

The Census 2000 questionnaire has 15 possible race responses. The 15 responses are collapsed into six major race groups as shown below. Races that are included in the major groups are shown in parentheses. Persons self-identifying with a single race essentially place themselves into one of these six categories.
* White
* Black (Black, African American, Negro)
* American Indian or Alaska Native
* Asian (Asian Indian, Chinese, Filipino, Japanese, Korean, Vietnamese, Other Asian)
* Native Hawaiian or Pacific Islander (Native Hawaiian, Guamanian or Chamorro, Samoan, Other Pacific Islander)
* Some other race (There was a box on the questionnaire labeled Some other race - Print race with a line to enter any race the respondent desired.)

For the first time in census history, persons were able to respond to more than one race category. Allowing persons to self-identify with multiple races results in many more than six race groups. In fact, after collapsing race to the six major groups, there are $2^6 - 1 = 63$ possible race combinations. It is necessary to subtract the 1 in this equation since each individual is assumed to have a race.

The race variable defined above is often cross-classified with the Hispanic origin variable to define post-strata. The Hispanic origin variable consists of two responses, No and Yes. Categories that are included in the Yes response are shown in parentheses.
1. No, not Spanish/Hispanic/Latino
2. Yes (Mexican, Mexican American, Chicano, Puerto Rican, Cuban, Other Spanish/Hispanic/Latino)

Combining the race and Hispanic origin variables yields 63 x 2 = 126 possible race/Hispanic origin groups. It is important to note that in a survey the size of A.C.E., no post-stratification plan of interest can support 126 race/Hispanic origin groups. Consequently, each of the 126 race/Hispanic origin response possibilities was assigned to one of seven race/Hispanic origin domains. The seven race/Hispanic origin domains are defined as follows:
1. American Indian or Alaska Native on Reservations
2. Off-Reservation American Indian or Alaska Native
3. Hispanic
4. Non-Hispanic Black
5. Native Hawaiian or Pacific Islander
6. Non-Hispanic Asian
7. Non-Hispanic White or Some other race

Note that missing race and Hispanic origin data are imputed. The rules used to classify the 126 race and Hispanic origin combinations into one of the seven race/Hispanic origin domains are now presented. Many of the decisions on how multiple race persons were classified are based on cultural, linguistic, and sociological factors, which are known to affect coverage and are not necessarily data-driven.

A hierarchy was used to assign persons to a race/Hispanic origin domain. The race/Hispanic origin designation occurs in the following order: American Indian or Alaska Native on Reservations, Off-Reservation American Indian or Alaska Native, Hispanic, Non-Hispanic Black, Native Hawaiian or Pacific Islander, Non-Hispanic Asian, and Non-Hispanic White or Some other race. This collapsing was only used for the post-stratification; all census data were tabulated in accordance with the race and Hispanic origin categories selected by census respondents.

For the following tables, Indian Country (IC) is a block-level variable that indicates whether a block is (wholly or partly) inside an American Indian reservation/trust land, Oklahoma Tribal Statistical Area (OTSA), Tribal Designated Statistical Area (TDSA), or Alaska Native Village Statistical Area (ANVSA). Tables 7-4 and 7-5 display the assignment of race/Hispanic origin domains. Table 7-4 applies to Hispanic persons, while Table 7-5 applies to non-Hispanic persons.
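The hierarchy and the cell values of Tables 7-4 and 7-5 can be condensed into a small lookup function. The following is a minimal sketch, not the Census Bureau's production logic: the function name, the race labels, and the three `location` codes are hypothetical, and the choice to test Indian Country before the Hawaii rule is an assumption based on the footnote marker appearing only in the "Not in IC" column of the tables.

```python
# Hypothetical labels for the six major race groups.
AIAN, BLACK, NHPI, ASIAN, WHITE, SOR = "AIAN", "Black", "NHPI", "Asian", "White", "SOR"

# Domain for a non-Hispanic single-race response outside Indian Country.
SINGLE_RACE_DOMAIN = {AIAN: 2, BLACK: 4, NHPI: 5, ASIAN: 6, WHITE: 7, SOR: 7}

LOCATIONS = ("not_ic", "ic_off_reservation", "reservation")


def assign_domain(races, hispanic, location, in_hawaii=False):
    """Return the race/Hispanic origin domain (1-7) for one person.

    races     -- iterable of major race groups marked by the person
    hispanic  -- True for a Yes response to Hispanic origin
    location  -- one of LOCATIONS (block-level Indian Country status)
    in_hawaii -- True if the person lives in the state of Hawaii
    """
    races = frozenset(races)
    if not races:
        raise ValueError("missing race is imputed before domain assignment")
    if location not in LOCATIONS:
        raise ValueError("unknown location code")

    # Domains 1 and 2: any AIAN response inside Indian Country.
    if AIAN in races and location == "reservation":
        return 1
    if AIAN in races and location == "ic_off_reservation":
        return 2
    # Hawaii exception: any NHPI response goes to Domain 5.
    if in_hawaii and NHPI in races:
        return 5
    # Domain 3: all remaining Hispanic persons.
    if hispanic:
        return 3
    # Non-Hispanic persons from here on.
    if len(races) == 1:
        return SINGLE_RACE_DOMAIN[next(iter(races))]
    if len(races) >= 3:
        return 7
    if AIAN in races:
        # AIAN plus one other race, not in Indian Country:
        # the person follows the other race's domain.
        (other,) = races - {AIAN}
        return SINGLE_RACE_DOMAIN[other]
    if BLACK in races:
        return 4
    if races == {NHPI, ASIAN}:
        return 5
    return 7
```

For instance, a non-Hispanic person marking American Indian and Black is placed in Domain 1 on a reservation but in Domain 4 outside Indian Country, matching the example given earlier in this chapter and the first multiple-race row of Table 7-5.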
The first six rows of Tables 7-4 and 7-5 correspond to a single race response. The remaining portions of the tables address the assignment of multiple race responses to a single race/Hispanic origin domain. Although a person may be associated with multiple race responses, each person is included in only one of the seven race/Hispanic origin domains. All persons with a common number are assigned to the same race/Hispanic origin domain. The number for each race/Hispanic origin domain was assigned as follows:

Domain 1 (Includes American Indian or Alaska Native on Reservations). This domain includes any person living on a reservation marking American Indian or Alaska Native either as their single race or as one of many races, regardless of their Hispanic origin.

Domain 2 (Includes Off-Reservation American Indian or Alaska Native). This domain includes any person living in Indian Country, but not on a reservation, who marks American Indian or Alaska Native either as a single race or as one of many races, regardless of their Hispanic origin. This domain also includes any Non-Hispanic person not living in Indian Country who marks American Indian or Alaska Native as a single race.

Domain 3 (Includes Hispanic). This domain includes all Hispanic persons who are not included in Domains 1 or 2. All Hispanic persons (excluding American Indian or Alaska Native in Indian Country) are included in Domain 3. The only exception to this rule occurs when a Hispanic person lives in the state of Hawaii and classifies himself or herself as Native Hawaiian or Pacific Islander, regardless of whether he or she identifies with a single or multiple race. All Hispanic persons satisfying this condition are re-classified into Domain 5.

Domain 4 (Includes Non-Hispanic Black). This domain includes any non-Hispanic person who marks Black as their only race. It also includes the combination of Black and American Indian or Alaska Native not in Indian Country. In addition, people who mark Black and another single race group (Native Hawaiian or Pacific Islander, Asian, White, or Some other race) are included in Domain 4. The only exception to this rule occurs when a Non-Hispanic Black person lives in the state of Hawaii and classifies himself or herself as Native Hawaiian or Pacific Islander. All Non-Hispanic Black persons satisfying this condition are reclassified into Domain 5.

Domain 5 (Includes Native Hawaiian or Pacific Islander). This domain includes any Non-Hispanic person marking the single race Native Hawaiian or Pacific Islander. For Non-Hispanic persons, it also includes the race combination of Native Hawaiian or Pacific Islander and American Indian or Alaska Native not in Indian Country.
Also included is the race combination of Native Hawaiian or Pacific Islander with Asian for Non-Hispanic persons. All persons living in the state of Hawaii who classify themselves as Native Hawaiian or Pacific Islander, regardless of their Hispanic origin and whether they identify with a single or multiple race, are also included in Domain 5.

Domain 6 (Includes Non-Hispanic Asian). This domain includes any non-Hispanic person marking Asian as their single race. If a person self-identifies with Asian and American Indian or Alaska Native not in Indian Country, they are included in Domain 6.

Domain 7 (Includes Non-Hispanic White or Some other race). Non-Hispanic White or Non-Hispanic Some other race persons are included in Domain 7. Non-Hispanic persons who self-identify with American Indian or Alaska Native not in Indian Country and are White or Some other race are classified into Domain 7. If a Native Hawaiian or Pacific Islander response is combined with a White or Some other race response, they also are included in Domain 7. A person who self-identifies with Asian and White or Asian and Some other race is also included in this domain. Finally, all Non-Hispanic persons who self-identify with three or more races (excluding American Indian or Alaska Native in Indian Country) are included in Domain 7. The only exception to this rule occurs when a Non-Hispanic White or Non-Hispanic Some other race person lives in Hawaii and classifies themselves as Native Hawaiian or Pacific Islander, regardless of whether they identify with other races. Persons who satisfy these criteria are re-classified into Domain 5.

Table 7-4. Census 2000 A.C.E. Race/Origin Post-stratification Domains for Hispanic

                                                              Indian country (IC)
                                               Not in IC   Not on reservation   On reservation
Single race:
  American Indian or Alaska Native                  3              2                  1
  Black                                             3              3                  3
  Native Hawaiian or Pacific Islander              *3              3                  3
  Asian                                             3              3                  3
  White                                             3              3                  3
  Some other race                                   3              3                  3
American Indian or Alaska Native and:
  Black                                             3              2                  1
  Native Hawaiian or Pacific Islander              *3              2                  1
  Asian                                             3              2                  1
  White                                             3              2                  1
  Some other race                                   3              2                  1
Black and:
  Native Hawaiian or Pacific Islander              *3              3                  3
  Asian                                             3              3                  3
  White                                             3              3                  3
  Some other race                                   3              3                  3
Native Hawaiian or Pacific Islander and:
  Asian                                            *3              3                  3
  White                                            *3              3                  3
  Some other race                                  *3              3                  3
Asian and:
  White                                             3              3                  3
  Some other race                                   3              3                  3
American Indian or Alaska Native and:
  Two or More Races                                *3              2                  1
All Else**                                         *3              3                  3

* All persons living in the state of Hawaii who classify themselves as Native Hawaiian or Pacific Islander, regardless of their Hispanic origin and whether they identify with a single or multiple race, are included in Domain 5, which includes Native Hawaiian or Pacific Islander.
** All Else encompasses all remaining combinations that exclude American Indian or Alaska Native.

Table 7-5. Census 2000 A.C.E. Race/Origin Post-stratification Domains for Non-Hispanic

                                                              Indian country (IC)
                                               Not in IC   Not on reservation   On reservation
Single race:
  American Indian or Alaska Native                  2              2                  1
  Black                                             4              4                  4
  Native Hawaiian or Pacific Islander               5              5                  5
  Asian                                             6              6                  6
  White                                             7              7                  7
  Some other race                                   7              7                  7
American Indian or Alaska Native and:
  Black                                             4              2                  1
  Native Hawaiian or Pacific Islander               5              2                  1
  Asian                                             6              2                  1
  White                                             7              2                  1
  Some other race                                   7              2                  1
Black and:
  Native Hawaiian or Pacific Islander              *4              4                  4
  Asian                                             4              4                  4
  White                                             4              4                  4
  Some other race                                   4              4                  4
Native Hawaiian or Pacific Islander and:
  Asian                                             5              5                  5
  White                                            *7              7                  7
  Some other race                                  *7              7                  7
Asian and:
  White                                             7              7                  7
  Some other race                                   7              7                  7
American Indian or Alaska Native and:
  Two or More Races                                *7              2                  1
All Else**                                         *7              7                  7

* All persons living in the state of Hawaii who classify themselves as Native Hawaiian or Pacific Islander, regardless of their Hispanic origin and whether they identify with a single or multiple race, are included in Domain 5, which includes Native Hawaiian or Pacific Islander.
** All Else encompasses all remaining combinations which exclude American Indian or Alaska Native.

VARIANCE ESTIMATION

The A.C.E. sample was considered a three-phase sample: the initial listing sample was the first phase; A.C.E. reduction and small block cluster subsampling was the second phase; and Targeted Extended Search (TES) was the third phase. Multiphase sampling differs from multistage in the following way. In a multistage design, the information needed to draw all stages of the sample is known before the sampling begins; in a multiphase design, the information needed to draw any phase of the sample is not available until the previous phase is completed. Because of the multiphase nature of the design (housing counts not available until after the first-phase listing), a new variance estimator needed to be developed.
Full details are given in Starsinic and Kim (2001).

Our goal is to obtain a variance estimator for the Dual System Estimator (DSE), of the form:

$$ DSE = DD \times \left(\frac{CE}{N_e}\right) \times \left(\frac{N_n + N_i}{M_n + \left(\frac{M_o}{N_o}\right) N_i}\right) \qquad (1) $$

where:
$DD$ = number of census data-defined persons
$CE$ = estimated number of A.C.E. E-sample correct enumerations
$N_e$ = estimated number of A.C.E. E-sample persons
$N_n$ = estimated number of A.C.E. P-sample nonmovers
$N_i$ = estimated number of A.C.E. P-sample inmovers
$N_o$ = estimated number of A.C.E. P-sample outmovers
$M_n$ = estimated number of A.C.E. P-sample nonmover matches
$M_o$ = estimated number of A.C.E. P-sample outmover matches

The DSE is computed separately for each post-stratum, denoted by $h$. The national corrected population estimate is computed as:

$$ \hat{T}_{US} = \sum_h DSE_h \qquad (2) $$

There is no closed-form solution for the variance estimator, and the Taylor linearization variance estimator is very complex. That leaves replication methodology as the only practical variance estimator. Specifically, a stratified jackknife estimator was the type of replication method chosen for the implementation.

A jackknife estimator is calculated from a set of replicates where the number of replicates is equal to the number of observations (clusters in this case) in the sample. Each replicate represents what the DSE would have been had one particular cluster not been part of the sample. The overall variance is calculated by summing the squares of the differences between the replicate DSE and the whole-sample DSE.

The most important challenge for the Census 2000 A.C.E. variance estimation was the precise form for calculating the contribution of replicate DSEs to the variance estimator; in particular, new weights had to be calculated for replicates to represent the effect of removing the cluster whose replicate was being calculated. No previous results were directly applicable to the DSE, but a methodology was developed based on the work of Rao and Shao (1992).

The remaining part of this section describes the precise formulas in detail. They require somewhat complex notation and mathematical steps.

Detailed Methodology

A general estimator of a total is:

$$ \hat{T}_y = \sum_i w_i y_i \qquad (3) $$

The estimator for the $j$th replicate is

$$ \hat{T}_y^{(j)} = \sum_i w_i^{(j)} y_i \qquad (4) $$

where $y_i$ is the characteristic of interest, and $w_i^{(j)}$ is the replicate weight for the $i$th unit, which differs from the original weight in a prespecified subset of the observations. With these replicate estimators, a variance estimator can be constructed:
$$ \widehat{Var}(\hat{T}_y) = \sum_j c_j \left(\hat{T}_y^{(j)} - \hat{T}_y\right)^2 \qquad (5) $$

Before continuing, we must set down some specific notation. Let $w_i$ be the first-phase sampling weight, and let $y_i$ be the cluster-level total of any of the seven estimated components of the DSE ($CE$, $N_n$, etc.). Let $A$ and $A_2$ indicate the first- and second-phase samples, respectively. Let $x_{ig} = 1$ if unit $i$ is in group (second-phase stratum) $g$ and zero otherwise. Let $n_h$ be the number of units selected in first-phase stratum $h$. Let $n_g$ be the number of units in stratum $h$ that are also in group $g$, and let $r_g$ be the number of the $n_g$ units selected in the second phase. In all of the following equations, $j$ will represent one cluster that is being dropped to calculate its associated replicate estimate $\hat{T}^{(j)}$; $k$ is one cluster other than the one being dropped.

For two-phase stratified sampling, there are two different point estimators, the Double Expansion Estimator (DEE)

$$ DEE = \sum_g \sum_{i \in A_2} \frac{n_g}{r_g} w_i x_{ig} y_i \qquad (6) $$

and the Reweighted Expansion Estimator (REE)

$$ REE = \sum_g \sum_{i \in A_2} \left(\frac{\sum_{k \in A} w_k x_{kg} }{\sum_{k \in A_2} w_k x_{kg} }\right) w_i x_{ig} y_i \qquad (7) $$

There is an established result by Rao and Shao (1992) which gives a replicate variance estimator for the REE under two-phase stratified sampling. Unfortunately, all the individual components of the DSE, such as $N_e$, the number of E-sample people, are DEEs. Taking a closer look at the DEE, however, suggested a procedure that could be
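applied.

$$ n_g = \sum_{i \in A} x_{ig}, \qquad r_g = \sum_{i \in A_2} x_{ig} $$

$$ DEE = \sum_g \sum_{i \in A_2} \frac{n_g}{r_g} w_i x_{ig} y_i = \sum_g \sum_{i \in A_2} \left(\frac{\sum_{k \in A} x_{kg} }{\sum_{k \in A_2} x_{kg} }\right) w_i x_{ig} y_i = \sum_g \sum_{i \in A_2} \left(\frac{\sum_{k \in A} w_k x_{kg} w_k^{-1} }{\sum_{k \in A_2} w_k x_{kg} w_k^{-1} }\right) w_i x_{ig} y_i \qquad (8) $$

The DEE has just been rewritten in a form that is quite similar to the REE. This suggests the following generalization:

$$ \hat{T}_{y2} = \sum_{i \in A_2} \alpha_i y_i, \quad \text{where} \quad \alpha_i = \sum_g \left(\frac{\sum_{k \in A} w_k x_{kg} q_k}{\sum_{k \in A_2} w_k x_{kg} q_k}\right) w_i x_{ig} \qquad (9) $$

and where $q_k = 1$ for the REE and $q_k = w_k^{-1}$ for the DEE. Replicates are then naturally written as:

$$ \hat{T}_{y2}^{(j)} = \sum_{i \in A_2} \alpha_i^{(j)} y_i, \quad \text{where} \quad \alpha_i^{(j)} = \sum_g \left(\frac{\sum_{k \in A} w_k^{(j)} x_{kg} q_k}{\sum_{k \in A_2} w_k^{(j)} x_{kg} q_k}\right) w_i^{(j)} x_{ig} \qquad (10) $$

When $q_k = 1$ (i.e., the REE case), the replicate variance estimator of this generalized estimator, based on equation (5), is the same as the REE replicate variance estimator of Rao and Shao (1992).

Application To a Three-Phase Dual System Estimator

Within any of the seven components of the DSE that are subject to sampling error ($CE$, $N_e$, $N_n$, $N_o$, $N_i$, $M_n$, and $M_o$), the cluster sums ($y_i$) can be broken down into two components: the total prior to any adjustments made by TES ($u_i$), and the additional total from the TES sample ($v_i$). This second piece can be further subdivided into TES totals from clusters sampled with certainty, and TES totals from clusters sampled systematically. The estimator (a DEE) of one of the components is

$$ \hat{T}_{y3} = \sum_{i \in A_2} \alpha_i u_i + \sum_{k=1}^{2} \sum_{i \in A_2} \alpha_i t_k s_{ik} a_i v_i \qquad (11) $$

where $s_{ik}$ is the third-phase stratum indicator ($s_{i1} = 1$ if the cluster is selected with certainty, 0 otherwise; $s_{i2} = 1 - s_{i1}$, an indicator that the cluster is eligible to be selected systematically), $a_i$ is the third-phase sample indicator ($a_i = 1$ if the cluster is in $A_3$, 0 otherwise), and $t_k$, the TES conditional weight, is equal to

$$ t_k = \frac{\sum_{i \in A_2} s_{ik} }{\sum_{i \in A_2} s_{ik} a_i} = \frac{\sum_{i \in A_2} s_{ik} }{\sum_{i \in A_3} s_{ik} } = \frac{\text{number of clusters selected in phase 2} }{\text{number of clusters selected in phase 3} } \qquad (12) $$

For $s_{i1}$, the certainty stratum, all clusters within it have $a_i = 1$, so $t_1 = 1$ for all clusters in the stratum.

To create the replicate estimator, simply apply what was learned above in equations (8) and (10).

$$ \hat{T}_{y3}^{(j)} = \sum_{i \in A_2} \alpha_i^{(j)} u_i + \sum_{k=1}^{2} \sum_{i \in A_2} \alpha_i^{(j)} t_k^{(j)} s_{ik} a_i v_i = \sum_{i \in A_2} \alpha_i^{(j)} u_i + \sum_{i \in A_2} \alpha_i^{(j)} t_1^{(j)} s_{i1} a_i v_i + \sum_{i \in A_2} \alpha_i^{(j)} t_2^{(j)} s_{i2} a_i v_i \qquad (13) $$

where

$$ t_1^{(j)} = 1, \qquad t_2^{(j)} = \frac{\sum_{i \in A_2} \alpha_i^{(j)} s_{i2} \alpha_i^{-1} }{\sum_{i \in A_2} \alpha_i^{(j)} s_{i2} a_i \alpha_i^{-1} } $$

Implementation of Variance Estimation for the A.C.E.

The first step in implementing this variance estimation methodology is calculating the replicate weights. To this point, the method of replication used to arrive at the variance is immaterial, but we will now state that the jackknife will be used. Let the replicate weights after the first stage of sampling be the standard jackknife replicate weights

$$ w_i^{(j)} = \begin{cases} 0 & \text{if } i = j \\ \frac{n_h}{n_h - 1} w_{hi} & \text{if } i \text{ and } j \text{ are in the same first-phase stratum} \\ w_{hi} & \text{otherwise} \end{cases} \qquad (14) $$

Then, the final weights are obtained by applying equation (10).

Note that this is an unusual form of the jackknife. Normally, the jackknife has as many replicates as observations. Here, there are 11,303 clusters remaining after the second phase of the sample, but the number of replicates is equal to the first phase's sample size of 29,136 clusters.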
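To make the replicate weights of equation (14) concrete, the sketch below applies them to a simple weighted total and combines the replicates as in equation (5). It is a simplified illustration under stated assumptions, not the A.C.E. implementation: it covers only a single-phase total, omits the second- and third-phase reweighting of equations (10) and (13), and takes the coefficient $c_j$ to be $(n_h - 1)/n_h$, the usual stratified jackknife factor. The function names are hypothetical.

```python
from collections import Counter


def jackknife_replicate_weights(weights, strata, j):
    """Replicate weights when cluster j is dropped, following equation (14)."""
    n_h = Counter(strata)          # first-phase sample size per stratum
    h_j = strata[j]
    if n_h[h_j] < 2:
        raise ValueError("the dropped cluster's stratum needs at least two clusters")
    factor = n_h[h_j] / (n_h[h_j] - 1)
    replicate = []
    for i, (w, h) in enumerate(zip(weights, strata)):
        if i == j:
            replicate.append(0.0)          # the dropped cluster
        elif h == h_j:
            replicate.append(factor * w)   # same stratum: reweight upward
        else:
            replicate.append(w)            # other strata: unchanged
    return replicate


def jackknife_variance_of_total(weights, strata, y):
    """Return (estimate, variance) for the weighted total of y."""
    n_h = Counter(strata)
    full = sum(w * v for w, v in zip(weights, y))
    variance = 0.0
    for j in range(len(weights)):
        rep_w = jackknife_replicate_weights(weights, strata, j)
        rep_total = sum(w * v for w, v in zip(rep_w, y))
        c_j = (n_h[strata[j]] - 1) / n_h[strata[j]]   # ASSUMED coefficient
        variance += c_j * (rep_total - full) ** 2
    return full, variance


# Hypothetical data: three clusters in one stratum, unit weights.
total, var = jackknife_variance_of_total([1.0, 1.0, 1.0], ["a", "a", "a"], [2.0, 4.0, 6.0])
```

In this toy case the replicate totals are 15, 12, and 9 against a full-sample total of 12, so the variance estimate is (2/3)(9 + 0 + 9) = 12, the same value given by the textbook with-replacement formula for a single stratum.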
The clusters sampled out in the second phase obviously do not contribute to the variance due to the second and third phases, but they must be included to accurately account for the first phase of sampling. Deleting a cluster that was sampled out changes the weights of the other clusters that were in the same first-phase sampling stratum.

The second step of the implementation is to adjust the imputation of certain probabilities to account for the replication. This is a component of the variance that can be accounted for by including the effect of the replicate weights in the imputation. For some persons, their match, residence, or correct enumeration status remains unresolved even after follow-up operations. In these cases, a probability for each unresolved status is imputed using an imputation cell technique, with each unresolved case in an imputation cell getting the same imputed probability. The general form for the replicated imputation of the probability for an unresolved person in imputation cell $k$ is:
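$$ Pr_k^{*(j)} = \frac{\sum_{\text{resolved } p \in k} w_p^{*(j)} t_p^{*(j)} Pr_p}{\sum_{\text{resolved } p \in k} w_p^{*(j)} t_p^{*(j)} } \qquad (15) $$

where the summation is over all resolved persons in imputation cell $k$, and:

$w_p^{*(j)}$ = person-level weight for replicate $j$, incorporating all sampling operations except TES, and not including the noninterview adjustment

$t_p^{*(j)}$ = conditional TES weight for replicate $j$ (the inverse of the probability of selection in the TES sample) if the person is a TES person; 1 if the person is NOT a TES person

$Pr_p$ = 1 if a person is a match / resident / correct enumeration; 0 if a person is NOT a match / resident / correct enumeration

To complete the estimation of the variances, the 29,136 replicate dual system estimates were computed for each of the 448 post-strata:

$$ DSE_h^{(j)} = (C - II) \times \left(\frac{CE^{(j)} }{N_e^{(j)} }\right) \times \left(\frac{N_n^{(j)} + N_i^{(j)} }{M_n^{(j)} + \left(\frac{M_o^{(j)} }{N_o^{(j)} }\right) N_i^{(j)} }\right) \qquad (16) $$

Here $C - II$ plays the role of the data-defined count $DD$ in equation (1); it is not subject to sampling error and therefore carries no replicate index. Equation (13) was used for the separate computation of each of the seven replicated terms of the DSE: $CE^{(j)}$, $N_e^{(j)}$, $N_n^{(j)}$, $N_i^{(j)}$, $N_o^{(j)}$, $M_n^{(j)}$, and $M_o^{(j)}$. The variance estimates for post-stratum $h$ used formula (5):

$$ \widehat{Var}(DSE_h) = \sum_j \frac{n_{1,i} - 1}{n_{1,i} } \left(DSE_h^{(j)} - DSE_h\right)^2 \qquad (17) $$

where the factor $(n_{1,i} - 1)/n_{1,i}$ is the coefficient $c_j$ of formula (5), with $n_{1,i}$ the first-phase sample size of the sampling stratum containing the dropped cluster $j$.

Finally, the variance of the national adjusted population estimate is: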
$$ \widehat{Var}(\hat{T}_{US}) = \sum_{h} \sum_{h'} \widehat{Cov}(DSE_h, DSE_{h'}), $$

where $\widehat{Cov}(DSE_h, DSE_h) = \widehat{Var}(DSE_h)$, and

$$ \widehat{Cov}(DSE_h, DSE_{h'}) = \sum_j \frac{n_{1,i} - 1}{n_{1,i} } \left(DSE_h^{(j)} - DSE_h\right) \left(DSE_{h'}^{(j)} - DSE_{h'}\right) \qquad (18) $$

Covariances exist between post-strata mostly because of correlations between members of the same household being in different post-strata but having the same probability of being included in the sample. For instance, within a given race/Hispanic origin/tenure/region group there exists some covariance among males 30-49, females 30-49 and children 0-17, because such persons are likely to live in the same household, and hence, show very similar census and A.C.E. inclusion probabilities.

RESULTS

The percent net undercount (UC) is the estimated net undercount (or net overcount) divided by the dual system estimate for a post-stratum expressed as a percentage. A positive number implies undercoverage, while a negative number implies overcoverage. The percent net undercount for Census 2000 shown in this document is strictly for the household population and excludes group quarters persons.

$$ UC = \left(\frac{DSE - C}{DSE}\right) \times 100 $$

where $C$ is the census count. Table 7-6 presents the estimated percent net undercount for each of the 64 post-stratum groups. Table 7-7 presents the standard error of each of these estimates. Many more results are available in Davis (2001).

Table 7-6. Census 2000 A.C.E. 64 Post-Stratum Groups - Percent Net Undercount

Domain 7 (Non-Hispanic White or Some other race)*, Owner

                                        High return rate                Low return rate
MSA/TEA                               NE      MW       S       W       NE      MW       S       W
Large MSA MO/MB                     0.81    0.01    0.36   -0.38    -3.62   -2.61    2.19    1.14
Medium MSA MO/MB                    0.30   -0.12    0.46   -0.28    -4.39   -0.33    0.66    1.81
Small MSA & Non-MSA MO/MB          -0.25    0.14    0.44    0.30     2.29    2.61    2.09    2.71
All other TEAs                      1.84   -1.11    1.34    0.85     0.56   -0.16    0.15    1.59

Domains 7 (Nonowner), 4, and 3*

Domain                                    Tenure     MSA/TEA                                               High return rate   Low return rate
Domain 7 (Non-Hispanic White or           Nonowner   Large MSA MO/MB                                             1.82              1.02
  Some other race)                                   Medium MSA MO/MB                                            0.61              2.83
                                                     Small MSA & Non-MSA MO/MB                                   2.45              3.61
                                                     All other TEAs                                              1.64              4.08
Domain 4 (Non-Hispanic Black)             Owner      Large MSA MO/MB and Medium MSA MO/MB                        1.63             -1.31
                                                     Small MSA & Non-MSA MO/MB and All other TEAs                0.07              0.46
                                          Nonowner   Large MSA MO/MB and Medium MSA MO/MB                        4.18              3.42
                                                     Small MSA & Non-MSA MO/MB and All other TEAs                2.64              0.12
Domain 3 (Hispanic)                       Owner      Large MSA MO/MB and Medium MSA MO/MB                        1.46              0.04
                                                     Small MSA & Non-MSA MO/MB and All other TEAs                1.66              1.08
                                          Nonowner   Large MSA MO/MB and Medium MSA MO/MB                        3.52              4.98
                                                     Small MSA & Non-MSA MO/MB and All other TEAs                4.88             10.74

Domains 5, 6, 1, and 2*

Domain                                                                Owner    Nonowner
Domain 5 (Native Hawaiian or Pacific Islander)                         2.71      6.58
Domain 6 (Non-Hispanic Asian)                                          0.55      1.58
American Indian or Alaska Native, Domain 1 (On Reservation)            5.04      4.10
American Indian or Alaska Native, Domain 2 (Off Reservation)           1.60      5.57

* For Census 2000, persons can self-identify with more than one race group. For post-stratification purposes, persons are included in a single Race/Hispanic Origin Domain. This classification does not change a person's actual response. Further, all official tabulations are based on actual responses to the census. A negative net undercount denotes a net overcount.

Table 7-7. Census 2000 A.C.E. 64 Post-Stratum Groups - Standard Error of the Net Undercount in Percent

Domain 7 (Non-Hispanic White or Some other race)*, Owner

                                        High return rate                Low return rate
MSA/TEA                               NE      MW       S       W       NE      MW       S       W
Large MSA MO/MB                     0.43    0.36    0.87    0.45     1.05    1.43    1.54    2.09
Medium MSA MO/MB                    0.85    0.28    0.42    0.38     1.52    0.84    1.10    2.79
Small MSA & Non-MSA MO/MB           1.33    0.40    0.43    0.57     3.60    2.12    1.08    1.49
All other TEAs                      1.06    0.39    0.97    1.66     2.17    1.21    0.65    1.89

Domains 7 (Nonowner), 4, and 3*

Domain                                    Tenure     MSA/TEA                                               High return rate   Low return rate
Domain 7 (Non-Hispanic White or           Nonowner   Large MSA MO/MB                                             0.63              1.01
  Some other race)                                   Medium MSA MO/MB                                            0.71              1.24
                                                     Small MSA & Non-MSA MO/MB                                   0.51              1.24
                                                     All other TEAs                                              0.94              1.67
Domain 4 (Non-Hispanic Black)             Owner      Large MSA MO/MB and Medium MSA MO/MB                        0.56              1.24
                                                     Small MSA & Non-MSA MO/MB and All other TEAs                1.07              1.86
                                          Nonowner   Large MSA MO/MB and Medium MSA MO/MB                        0.66              1.05
                                                     Small MSA & Non-MSA MO/MB and All other TEAs                0.96              2.08
Domain 3 (Hispanic)                       Owner      Large MSA MO/MB and Medium MSA MO/MB                        0.52              1.26
                                                     Small MSA & Non-MSA MO/MB and All other TEAs                1.01              2.09
                                          Nonowner   Large MSA MO/MB and Medium MSA MO/MB                        0.67              1.12
                                                     Small MSA & Non-MSA MO/MB and All other TEAs                1.55              4.12

Domains 5, 6, 1, and 2*

Domain                                                                Owner    Nonowner
Domain 5 (Native Hawaiian or Pacific Islander)                         3.83      4.07
Domain 6 (Non-Hispanic Asian)                                          0.87      0.98
American Indian or Alaska Native, Domain 1 (On Reservation)            1.45      1.42
American Indian or Alaska Native, Domain 2 (Off Reservation)           1.95      2.02

* For Census 2000, persons can self-identify with more than one race group. For post-stratification purposes, persons are included in a single Race/Hispanic Origin Domain. This classification does not change a person's actual response. Further, all official tabulations are based on actual responses to the census. A negative net undercount denotes a net overcount.

Attachment. The Effect of Movers on Dual System Estimation

This attachment discusses the effect of movers on Dual System Estimation (DSE). Three alternative methodologies for handling movers in DSE have been considered by the U.S. Census Bureau. Historically, they are referred to as PES-A, PES-B, and PES-C. However, the current terminology is to refer to them as Procedures A, B, and C. Following are the definitions of these methodologies as described in U.S.
Bureau of the Census (1985).

Procedure A. This procedure reconstructs the households as they existed at the time of the census. A respondent is asked to identify all persons who were living or staying in the sample household on Census Day. These persons are then matched against names on the census questionnaire for the sample address (and surrounding area). From this information, estimates of the number and percent matched for nonmovers and outmovers can be made.

Procedure B. This procedure identifies all current residents living or staying in the sample household at the time of the interview. The respondent is asked to provide the address(es) where these persons were living or staying on Census Day. These persons are then matched against names on corresponding census questionnaire(s) at the nonmover's or inmover's census address. Estimates of the number and percent matched for nonmovers and inmovers can be made.

Procedure C. This procedure identifies all current residents living or staying at the sample address at the time of the interview plus all other persons who lived at the sample address on Census Day and have moved since Census Day. However, only the Census Day residents (nonmovers and outmovers) are matched with the census questionnaire(s) at the sample address. Estimates of the number of nonmovers, outmovers, inmovers, and the percent matched for nonmovers and outmovers, can then be made. Estimates of nonmovers and movers come from Procedure B and match rate estimates for the movers from Procedure A (using outmover matching). Thus, Procedure C is a combination of Procedures A and B.

In 1990, Procedure B was used. The unresolved match rate for inmovers in 1990 was high, around 13 percent. In addition, with sampling for nonresponse initially planned for Census 2000, inmover matching would have had an even higher level of difficulty. A decision was made that Procedure B would NOT be used for Census 2000. When the Supreme Court decided against sampling for apportionment (no sampling for nonresponse), it was too late to change the decision on Procedure B.

In the 1995 and 1996 Census tests, Procedure A was used. The U.S. Census Bureau reasoned that an outmover match rate would be more accurate than an inmover match rate, particularly with sampling for nonresponse. For outmovers, interviewers attempted to obtain the names, new addresses, and other data that could be used for matching from the new occupants or neighbors. Then an attempt could be made to trace the people to obtain an interview with a household member. The best available data for outmovers were matched to their Census Day addresses in the same manner as for the nonmovers.

Outmover tracing had problems in 1995 and was tested in 1996 and in the Census 2000 Dress Rehearsal. The outmover tracing evaluation by Raglin and Bean (1999) showed that there is little gain in an outmover tracing operation. A decision was made to use the outmover proxy interview data for outmover matching for Census 2000.

Procedure C was tested in the Census 2000 Dress Rehearsal and it was used in Census 2000 (Schindler, 1999). The advantage of Procedure C is that the estimate of the number of movers uses inmover data, which is more reliable since it is collected from the inmovers themselves. The match rate of the movers is estimated using the outmover match rate so that the difficulties of inmover matching are avoided. Outmover tracing is a problem, however, and in many cases it is necessary to use proxy data for matching. There was no outmover tracing for Census 2000. Procedure C attempts to obtain a Procedure B estimate with no inmover matching. Procedure C and Procedure B estimates are different since outmovers do not have the same match rate as inmovers. However, the disadvantage of the Procedure B inmover match rate estimate is that it may yield a high percentage of unresolved cases.

Chapter 8. Model-Based Estimation for Small Areas

INTRODUCTION

This chapter documents the Accuracy and Coverage Evaluation (A.C.E.) methodology of synthetic estimation for small areas, including the estimation of sampling variances of synthetic estimates and the generalization of the variances. Synthetic estimation is the particular model used for coverage adjustment for small areas for A.C.E. First, the synthetic estimation methodology and the implied model are described. Then, the methodology for estimating sampling variances of these synthetic estimates and for generalizing these variances is discussed.

SYNTHETIC ESTIMATION METHODOLOGY FOR SMALL AREAS

Background

As discussed in Chapter 7, dual system estimates (DSE) and coverage correction factors were calculated at the post-stratum level. These are direct A.C.E. Survey estimates, based only on data from sample units in the post-stratum. However, census counts adjusted for coverage error are desirable for small geographic areas much smaller than any post-stratum, such as blocks, tracts, counties, and congressional districts. The adjusted counts were expected to improve data used for congressional redistricting as well as states, most metropolitan areas,
and larger counties and cities and to provide consistent totals when census data are aggregated over many small areas. Many of these areas do not include any A.C.E. sample units, making a direct estimate impossible (see Chapter 3 for details of A.C.E. sampling). The geographic areas that include A.C.E. sample units have only a small number of sample units. A direct estimate would result in unacceptably large standard errors.

Synthetic estimation is discussed in Ghosh and Rao (1994), Gonzalez (1973), and Gonzalez and Waksberg (1973). Gonzalez (1973) describes synthetic estimation as follows:

    An unbiased estimate is obtained from a sample survey for a large area; when this estimate is used to derive estimates for subareas under the assumption that the small areas have the same characteristics as the large area, we identify these estimates as synthetic estimates.

Synthetic estimation was first used by the National Center for Health Statistics (1968) to calculate state estimates of long and short term physical disabilities from the National Health Interview Survey data (Ghosh and Rao, 1994). Synthetic estimation is a useful procedure for small area estimation, mainly due to its simplicity and potential to increase accuracy in estimation by borrowing information from similar small areas.

Synthetic estimation was used for Census 2000 to provide adjusted population estimates for small geographic areas such as blocks, tracts, counties, and congressional districts. These block-level estimates can then be aggregated to any geographic level. The synthetic estimates provide revised population counts for both all persons and persons 18 and over. Counts are also provided for Hispanic or Latino persons by race (63 categories) and Not Hispanic or Latino persons by race (63 categories) for both the total population and the population 18 years and over. For example, counts of single-race Asian persons who are Not Hispanic or Latino are given for both the total population and the population 18 years and over. Counts of single-race Asians who are Not Hispanic or Latino who are less than 18 years of age can be obtained by subtraction.

Synthetic estimates are formed by combining coverage measurement results with census counts to produce population estimates for any geographic area of interest. For example, a block-level synthetic estimate is formed by distributing a post-stratum's coverage correction factor to blocks proportional to the size of the post-stratum's population within the block. Rounded, adjusted synthetic estimates at the tabulation block level constitute the adjusted redistricting¹ data file.

The synthetic estimation model assumes that coverage correction factors are uniform within a given post-stratum, meaning that the coverage error rate for a given post-stratum is the same within all blocks. To the extent that the synthetic assumption is incorrect, the estimates of coverage for individual areas are biased and, hence, so are the population size estimates based on the coverage correction factors. Synthetic estimation bias decreases as the size of the geographic area increases.

Synthetic Estimation

This section describes the calculation of synthetic estimates. Synthetic estimation includes a controlled rounding procedure used to produce estimates that are integer-valued. The visual representation of the twelve steps in the controlled rounding process given in Haines (2001) is provided here.
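As a minimal sketch of the synthetic model described above, the unrounded synthetic estimate for an area can be computed by multiplying each post-stratum's census count in that area by the post-stratum's coverage correction factor and summing over post-strata. The post-stratum labels, counts, and factors below are hypothetical values chosen for illustration, and the function name `synthetic_estimate` is an assumption, not part of any Census Bureau system.

```python
import math

def synthetic_estimate(census_counts, ccf):
    """Sum over post-strata of census count times coverage correction factor."""
    return sum(count * ccf[ps] for ps, count in census_counts.items())

# Hypothetical coverage correction factors for two post-strata.
ccf = {"ps_A": 1.02, "ps_B": 0.99}

# Hypothetical census counts by post-stratum for two blocks in one tract.
blocks = {
    "block_1": {"ps_A": 50, "ps_B": 100},
    "block_2": {"ps_A": 20, "ps_B": 0},
}

# Unrounded block-level synthetic estimates.
block_estimates = {name: synthetic_estimate(counts, ccf)
                   for name, counts in blocks.items()}

# Tract-level estimate computed directly from tract-level counts.
tract_counts = {ps: sum(counts[ps] for counts in blocks.values()) for ps in ccf}
tract_estimate = synthetic_estimate(tract_counts, ccf)

# Because the same factor applies everywhere within a post-stratum,
# the block estimates add up to the tract estimate (before rounding).
consistent = math.isclose(sum(block_estimates.values()), tract_estimate)
```

Before any rounding, additivity across geographic levels follows directly from the uniform-factor assumption. The controlled rounding procedure exists to preserve that additivity once the estimates are forced to integers.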
¹ Since it was originally intended that the A.C.E. might be used to adjust census counts for redistricting, such data is called redistricting data, although it was not ultimately used for that purpose.

Section I, Chapter 8. Model-Based Estimation for Small Areas (U.S. Census Bureau, Census 2000)

Calculation

Consider forming synthetic estimates for geographic level g for a given post-stratum. Let $C_{i,g}$ denote the census count for post-stratum i in geographic level g and define $CCF_i$ to be the coverage correction factor for post-stratum i. The general form for a synthetic estimate for post-stratum i at geographic level g is calculated as

$$\hat{N}_{i,g}^{S} = C_{i,g} \times CCF_i.$$

Aggregating synthetic estimates over all the post-strata in geographic level g yields a synthetic estimate for the total population of geographic level g. This is denoted as

$$\hat{N}_{g}^{S} = \sum_{i} C_{i,g} \times CCF_i.$$

One purpose of synthetic estimation and the controlled rounding procedure is to produce integer-valued adjusted synthetic estimates at the tabulation block level. Then, summing over different geographies within a larger area yields the same estimate as that for the larger geographic area. These estimates comprise the adjusted redistricting data file.

Geography

Components of synthetic estimates use two slightly different organizations of geography. Both collection and tabulation blocks are used in the synthetic estimation process.
A collection block is a geographic area used during census data-collection activities. The Hundred-Percent Census Edited File (HCEF) is based on collection block geography.
Tabulation blocks, on the other hand, are geographic areas used for tabulating census data. The Hundred-Percent Detail File (HDF) is based on tabulation block geography.

Synthetic estimation census counts are based on tabulation block geography while the coverage correction factors associated with post-strata are based on collection block geography. This could have ramifications on variables with a geographic component, although any such effects are probably small.

For example, consider the post-stratification variable return rate. Return rate was calculated at the tract level and based on collection-tract definitions. People were assigned to post-strata based on the return rate of tracts defined using collection blocks. Now consider the case where people are assigned to post-strata based on the return rate of tracts defined using tabulation blocks. It could be the case that the change in geography causes an individual's post-stratum assignment to change. For example, suppose the return rate of a collection-tract is 80 percent and that the collection tract is split into two pieces by a tabulation-tract. A person who belonged to the collection-tract (with an 80 percent return rate) may now belong to a tabulation-tract with a different return rate.

Changes in an individual's post-stratum would also cause changes in the dual system estimates, coverage correction factors, and synthetic estimates. To avoid potential inconsistencies in the assignment of people to post-strata, there was only one assignment of people to post-strata. The assignment was based on collection-block geography, which was consistent with the geography used in the A.C.E. Further, this post-stratification assignment was maintained for all estimation purposes.

Controlled Rounding

Synthetic estimates at any geographic level are not typically integer-valued. A controlled rounding program, developed by the Statistical Research Division (SRD) of the U.S. Census Bureau, was used to produce integer-valued estimates. The theory of controlled rounding is given in Cox and Ernst (1982). The problem is represented as a transportation theory problem to minimize an objective function that measures the change due to controlled rounding. In essence, the controlled rounding program takes a two-dimensional matrix of numbers and rounds each to an adjacent integer value based on an efficiency algorithm. An optimal solution that minimizes the change due to controlled rounding is guaranteed; there can, however, be more than one optimal solution. The two dimensions of the matrix are: 1) the post-strata for one level of geography; and 2) totals for a lower level of geography. The controlled rounding procedure ensures that the sum of the synthetic estimates within a geographic level is rounded up or down by an amount strictly less than one person.

The overall goal of controlled rounding was to obtain an integer number of persons for each post-stratum i within each tabulation block b, reflecting the estimates of overcount and undercount. The controlled rounding program could not be implemented in one step due to the size of the post-strata by tabulation block matrix. As a result, controlled rounding was implemented in steps such that the rounded, adjusted synthetic estimates for blocks sum to:

* the rounded, adjusted synthetic estimates for tracts,
* the rounded, adjusted synthetic estimates for counties, and
* the rounded, adjusted synthetic estimates for states.

In other words, the block, tract and county rounded, adjusted synthetic estimates would all be consistent with each other. Also, the state-level synthetic estimates are adjusted in order to guarantee that total population estimates at the state level sum to the national total population estimate.

A controlled rounding procedure for the U.S. can be implemented as follows:

1. Form the ratio of the control-rounded dual system estimate ($DSE^{R}$) to the unrounded DSE for post-stratum i. It is written as $DSE_i^{R} / DSE_i$.

2. For each post-stratum i within state s, multiply the state-level synthetic estimate by the ratio formed in step 1. The superscript AS denotes an adjusted synthetic estimate. The resulting product is the adjusted synthetic estimate for post-stratum i within state s, written as

$$\hat{N}_{i,s}^{AS} = \hat{N}_{i,s}^{S} \left[ \frac{DSE_i^{R}}{DSE_i} \right], \quad \text{where } \hat{N}_{i,s}^{S} = C_{i,s} \times CCF_i.$$

3. Apply the controlled rounding procedure to the adjusted state-level synthetic estimates to produce rounded, state-level synthetic estimates, denoted $\hat{N}_{i,s}^{RS}$. The superscript RS denotes a rounded, synthetic estimate. The two dimensions of this matrix are state s by post-stratum i.

4. Calculate the ratio of the rounded state-level synthetic estimate to the state-level synthetic estimate for post-stratum i in state s.

5. For each post-stratum i within county c for state s, multiply the county-level synthetic estimate by the ratio formed in step 4. The resulting product is the adjusted county-level synthetic estimate for post-stratum i, written as

$$\hat{N}_{i,c}^{AS} = \hat{N}_{i,c}^{S} \left[ \frac{\hat{N}_{i,s}^{RS}}{\hat{N}_{i,s}^{S}} \right], \quad \text{where } \hat{N}_{i,c}^{S} = C_{i,c} \times CCF_i.$$

6. Apply the controlled rounding procedure to the adjusted county-level synthetic estimates to produce rounded, adjusted, county-level synthetic estimates, denoted $\hat{N}_{i,c}^{RS}$. The two dimensions of this matrix are county c (in state s) by post-stratum i (in state s).

7. Form the ratio of the rounded, adjusted, county-level synthetic estimate to the county-level synthetic estimate for post-stratum i in county c in state s.

8. For each post-stratum i within tract t in county c for state s, form the product of the tract-level synthetic estimate and the ratio formed in step 7. This results in the adjusted tract-level synthetic estimate for post-stratum i, written as

$$\hat{N}_{i,t}^{AS} = \hat{N}_{i,t}^{S} \left[ \frac{\hat{N}_{i,c}^{RS}}{\hat{N}_{i,c}^{S}} \right], \quad \text{where } \hat{N}_{i,t}^{S} = C_{i,t} \times CCF_i.$$

9. Apply the controlled rounding procedure to the adjusted tract-level synthetic estimates to produce rounded, adjusted tract-level synthetic estimates, denoted $\hat{N}_{i,t}^{RS}$. The two dimensions of this matrix are tract t (in county c in state s) by post-stratum i (in county c in state s).

10. Calculate the ratio of the rounded, adjusted tract-level synthetic estimate to the tract-level synthetic estimate for post-stratum i in tract t in county c in state s.

11. For each post-stratum i within block b in tract t in county c for state s, multiply the block-level synthetic estimate by the ratio formed in step 10. The resulting product is the adjusted block-level synthetic estimate for post-stratum i, written as

$$\hat{N}_{i,b}^{AS} = \hat{N}_{i,b}^{S} \left[ \frac{\hat{N}_{i,t}^{RS}}{\hat{N}_{i,t}^{S}} \right], \quad \text{where } \hat{N}_{i,b}^{S} = C_{i,b} \times CCF_i.$$

12. Again, apply the controlled rounding procedure to the adjusted block-level synthetic estimates to produce rounded, adjusted block-level synthetic estimates, denoted $\hat{N}_{i,b}^{RS}$. The two dimensions of this matrix are block b in tract t in county c in state s by post-stratum i in tract t in county c in state s.

[Figure: visual representation of the controlled rounding steps. For each of the state, county (in state s), and tract (in county c in state s) levels, a pair of matrices with rows indexed by the geographic unit and columns by post-stratum i holds the adjusted (AS) and rounded (RS) synthetic estimates.]

Record Replication for Coverage Correction

Once the rounded, adjusted block-level synthetic estimates were formed, they were compared with the census counts for post-stratum i in tabulation block b. Person records were then replicated at the post-stratum level to reflect the coverage correction for the census blocks. No attempt was made to place these persons in households.
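The last stage of the cascade and the record replication that follows it can be sketched for a single post-stratum in one tract. This is an illustration under stated assumptions, not the Census Bureau's program. The function names, the counts, the coverage correction factor, and the tract's rounded estimate of 77 are all hypothetical. `largest_remainder_round` is a simple one-dimensional stand-in for the two-dimensional Cox and Ernst controlled rounding. The sketch also assumes the number of records to replicate never exceeds the number of census records available in the block.

```python
import math
import random

def ratio_adjust(synthetic, parent_rounded):
    """Scale unrounded block estimates so they sum to the tract's rounded estimate."""
    ratio = parent_rounded / sum(synthetic)
    return [x * ratio for x in synthetic]

def largest_remainder_round(values, target_total):
    """Stand-in rounding (NOT Cox-Ernst): each value moves to an adjacent
    integer and the rounded values sum to target_total."""
    floors = [math.floor(v) for v in values]
    shortfall = target_total - sum(floors)
    by_remainder = sorted(range(len(values)),
                          key=lambda k: values[k] - floors[k], reverse=True)
    rounded = list(floors)
    for k in by_remainder[:shortfall]:
        rounded[k] += 1
    return rounded

def replicate_records(records, rounded_estimate, rng):
    """Pick records without replacement; weight +1 for undercount, -1 for overcount."""
    diff = rounded_estimate - len(records)
    weight = 1 if diff > 0 else -1
    chosen = rng.sample(records, abs(diff))
    return [(record, weight) for record in chosen]

# Hypothetical inputs for one post-stratum in one tract with three blocks.
census_counts = [40, 25, 10]
ccf = 1.03
tract_rounded = 77  # assumed output of the tract-level rounding step

synthetic = [c * ccf for c in census_counts]          # unrounded block estimates
adjusted = ratio_adjust(synthetic, tract_rounded)     # block-level ratio adjustment
rounded = largest_remainder_round(adjusted, tract_rounded)

# Block 0: rounded estimate 41 versus 40 census records, so one record is replicated.
rng = random.Random(0)
block0_records = [f"person_{k}" for k in range(census_counts[0])]
block0_replicates = replicate_records(block0_records, rounded[0], rng)
```

With these inputs the rounded block estimates are 41, 26, and 10, which sum to the tract's rounded estimate of 77. A block whose rounded estimate equals its census count yields no replicated records.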
Thus, for example, the number of persons per household does not change due to coverage correction. The number of records replicated depends on the value of the coverage correction factor that is reflected in the rounded synthetic estimate for post-stratum i and tabulation block b.

Coverage Correction Factors > 1

When the coverage correction factor for post-stratum i was greater than one, undercount person records were replicated to reflect the undercount in post-stratum i and block b as follows:

$$U_{i,b} = \hat{N}_{i,b}^{RS} - C_{i,b}$$

If $U_{i,b} = 0$, then no additional records were necessary. If $U_{i,b} > 0$, then we replicated $U_{i,b}$ undercount person records for post-stratum i in tabulation block b. Undercount person records were replicated by randomly selecting without replacement $U_{i,b}$ records from the $C_{i,b}$ available person records in post-stratum i and tabulation block b. The selected records were replicated and appended to the file of person records. The undercount person record for each of the replicated records was given an effective weight of +1 for tabulations. This resulted in an upward adjustment of people in post-stratum i in tabulation block b.

Coverage Correction Factors < 1

When the coverage correction factor for post-stratum i was less than one, overcount person records were replicated to reflect the overcount in post-stratum i and block b as follows:

$$O_{i,b} = C_{i,b} - \hat{N}_{i,b}^{RS}$$

If $O_{i,b} > 0$, then $O_{i,b}$ overcount person records were replicated for post-stratum i in tabulation block b. Overcount person records were replicated by randomly selecting without replacement $O_{i,b}$ records from the $C_{i,b}$ available person records in post-stratum i and tabulation block b. The selected records were replicated and appended to the file of person records. The overcount person record for each of the replicated records was given an effective weight of -1 for tabulations, resulting in a downward adjustment of people in post-stratum i in tabulation block b.

VARIANCE ESTIMATION FOR SMALL AREAS

Estimating the error due to sampling for any published estimate is a policy of the Census Bureau. This policy applies to synthetic estimates as well as the more traditional estimates. Due to the large number of estimates at lower levels of geography, it is not feasible to provide tables listing the standard error of each published estimate. Instead, a parameter, the generalized coefficient of variation (GCV), is provided, that allows users to approximate the standard error for any desired estimate. The coefficient of variation of an estimate is simply the ratio of the estimate's standard error to the estimate itself.

Small area variance estimation is a two-step process. The first step consists of producing direct variance estimates for the synthetic count estimates for small areas such as census tracts. This process is explained under the heading "Direct Variance Estimates." The second step is to model the direct variance estimates using the generalized coefficient of variation, or GCV. This method is explained under the heading "Generalized Variance Estimates," along with an example.

Variances calculated for small areas do not account for all sources of synthetic error; they only reflect variations due to sampling. Synthetic population bias can exist since the same coverage correction factors are applied to areas with different net census coverage. See Griffin and Malec (2001) for details on estimating synthetic bias. In most very small geographic areas such as blocks and tracts, the biases are likely to be the principal source of errors. Sampling errors dominate the total error for larger areas such as states, metro area, etc. Bias in the post-stratum-level dual system estimates can stem from matching bias, data collection errors, and correlation bias, among other sources. Bell (2001) investigates and estimates correlation bias in the A.C.E. dual system estimates by comparing them to results from Demographic Analysis.

[Figure, continued: the corresponding pair of matrices for blocks (in tract t in county c in state s) by post-stratum i, holding the adjusted (AS) and rounded (RS) block-level synthetic estimates.]

Direct Variance Estimates

During the post-stratum-level A.C.E. variance estimation operation, a variance-covariance matrix of the A.C.E. coverage correction factors (CCFs) was produced. The estimated variance of any synthetic population estimate can be computed using this matrix and the unadjusted census counts, broken down by post-stratum and excluding out-of-scope persons in the A.C.E. See Starsinic (2001) for details.

A synthetic household population estimate (Group Quarters persons are not included) for tract t is written as

$$\hat{X}_t = \sum_{\text{post-strata } h} \hat{X}_{th} = \sum_{h=1}^{416} C_{th} \times CCF_h,$$

where $C_{th}$ is the final, unadjusted census count for post-stratum h in tract t. There were 416 post-strata used to estimate coverage.

The variance for the synthetic household population estimate $\hat{X}_t$ is

$$
\begin{aligned}
Var(\hat{X}_t) &= Var\left( \sum_{h=1}^{416} \hat{X}_{th} \right) \\
&= \sum_{h=1}^{416} \sum_{h'=1}^{416} Cov(\hat{X}_{th}, \hat{X}_{th'}) \\
&= \sum_{h=1}^{416} \sum_{h'=1}^{416} Cov(C_{th} \times CCF_h, \; C_{th'} \times CCF_{h'}) \\
&= \sum_{h=1}^{416} \sum_{h'=1}^{416} C_{th} \, C_{th'} \, Cov(CCF_h, CCF_{h'}).
\end{aligned}
$$

For a given data item j in tract t, the synthetic variance for the synthetic household population estimate $\hat{X}_{jt}$ is expressed as

$$Var(\hat{X}_{jt}) = \sum_{h=1}^{416} \sum_{h'=1}^{416} C_{jth} \, C_{jth'} \, Cov(CCF_h, CCF_{h'}), \qquad (1)$$

where $C_{jth}$ is the final, unadjusted census count for data item j in post-stratum h in tract t. Here h and h' refer to particular post-strata and j refers to a data item.

Generalized Variance Estimates

The generalized coefficient of variation (GCV) is the variance estimation methodology used for estimating variances of adjusted redistricting data and for estimates of adjusted population counts for the thousands of geographic areas that can be tabulated using synthetic estimation. For a given count in a particular state, the coefficient of variation (CV) was calculated for all tracts in that state that had population in the particular demographic category. The CV of an estimate is estimated as the ratio of the standard error of the estimate to the estimate itself, i.e.

$$CV(\hat{X}) = \frac{SE(\hat{X})}{\hat{X}}.$$

The standard error in the numerator is the square root of the variance estimate from (1). Tracts composed entirely of persons out-of-scope for the A.C.E. sample had no sampling variance (and therefore a CV of 0) and were removed from the processing. Also removed were tracts with a very small population in the demographic category, as these were shown in the Census 2000 Dress Rehearsal analysis to have a disproportionate downward effect on the parameters. The process of removing tracts was controlled to prevent removing an overly large fraction of small tracts for any adjusted demographic data item. In addition, outliers were identified using the relative absolute deviation (RAD) statistic for each data item j. Tracts with a RAD value above the cutoff value were removed and a new GCV was computed using CVs of remaining tracts. There were four iterations of identifying and removing outliers. Of the 286 unique demographic categories, GCVs were calculated for the 50 states and the District of Columbia for each of the 56 largest categories and 4 additional catch-all groups.

The average of the direct CVs for data items in a state is a GCV parameter. The state-level GCV parameters can then be used to estimate the standard error of a data item for all geographic areas within that state. Consider the following table of GCV parameters for a given state.
State Parameters for Calculating the Standard Error of A.C.E.-Adjusted Data

Columns (all values are GCVs):
  (1) All persons, all ages
  (2) All persons, 18 and over
  (3) Not Hispanic or Latino, all ages
  (4) Not Hispanic or Latino, 18 and over

Demographic category                                          (1)       (2)       (3)       (4)
All persons                                                   0.0063    0.0067    0.0066    0.0069
Hispanic or Latino                                            0.0106    0.0115    X         X
Population of one race                                        0.0064    0.0067    0.0066    0.0069
White alone                                                   0.0073    0.0077    0.0081    0.0083
Black or African American alone                               0.0073    0.0083    0.0073    0.0083
American Indian and Alaska Native alone                       0.0143    0.0147    0.0188    0.0190
Asian alone                                                   0.0080    0.0085    0.0081    0.0086
Native Hawaiian and Other Pacific Islander alone              0.0391    0.0495    0.0507    0.0545
Some Other Race alone                                         0.0109    0.0119    0.0126    0.0139
Population of two or more races                               0.0070    0.0077    0.0071    0.0082
Population of two races                                       0.0071    0.0078    0.0071    0.0082
White; Black or African American                              0.0103    0.0156    0.0103    0.0157
White; American Indian and Alaska Native                      0.0088    0.0092    0.0096    0.0100
White; Asian                                                  0.0116    0.0131    0.0120    0.0133
Black or African American; American Indian and Alaska Native  0.0129    0.0140    0.0128    0.0140
Asian; Native Hawaiian and Other Pacific Islander             0.0524    0.0560    0.0530    0.0566
All other combinations of two or more races                   0.0088    0.0095    0.0088    0.0099

Suppose a data user is interested in calculating the standard error of the population estimate of all Asians in a given county. The data user would locate the GCV parameter that corresponds to the "Asian alone" demographic category and the "All persons, All ages" classification in the appropriate state table. For the table above, the GCV parameter is 0.0080. Now assume that the population estimate for all Asians in this county is 370 people. Users are instructed to use the formula

$$SE(\hat{X}) = GCV \times \hat{X},$$

to calculate the estimated synthetic standard error, yielding 0.0080 x 370 = 2.96, or about 3 people in this example. Similar calculations can be done for any geographic level and demographic category.

Appendix A. Census 2000 Missing Data

INTRODUCTION

The Census Bureau used imputation in the 2000 Decennial Census, as it has in prior censuses, to address the problem of missing, incomplete, or contradictory data, an inevitable aspect of censuses and surveys. It is impossible not to have missing data in an endeavor as massive and complex as a decennial census. In Census 2000, the Census Bureau processed data for over 120 million households, including over 147 million paper questionnaires and 1.5 billion pages of printed material. In the 2000 Census, the various situations that resulted in missing data included incomplete or unavailable responses from housing units with previously confirmed addresses, conflicting data about the same housing unit, and failures in the data-capture process. The various types of missing data included characteristic data (information about an enumerated person, such as sex, race, age), population count data (information about the number of occupants in an identified housing unit), and housing unit status data (whether the unit is vacant, occupied, or nonexistent). The 2000 Census used two primary types of imputation.

1. The first type, called "count imputation," is imputation of the number of occupants of a housing unit. Count imputation applies when the Census Bureau is unable to secure any information regarding a given address, or when the Census Bureau has limited information about the address and does not have definitive information on the number of occupants.

2. The second type, called "characteristic imputation," is imputation that supplies missing characteristic data for a housing unit's response, but does not involve the number of occupants for a housing unit. For example, if a given housing unit did not provide ages for the individuals living in the housing unit, but supplied all other information, age would be imputed for the individuals in that housing unit. Sometimes the household size is known for the housing unit; however, none of the characteristics about the people are known. In this case all of the persons' characteristics are imputed.¹

This appendix summarizes the methods used to impute these types of missing data in the census. Some summary statistics showing the degree of imputation for these categories are given in the last section of this appendix.
BACKGROUND

The census data collection activities started around mid-March of 2000, through the mail or directly using census enumerators. From June to September, census staff conducted nonresponse follow-up (NRFU) and coverage improvement follow-up (CIFU) operations to revisit addresses for which census reports were not completed, i.e., addresses that did not respond to mailout/mailback or early enumeration operations. Based on the results of these operations, the Census Bureau was able to designate more than 99.5 percent of housing unit records as occupied, vacant, or nonexistent housing unit addresses. To designate an address as vacant or nonexistent required at least two independent census operations. This was to ensure complete census coverage. The nonexistent housing units were addresses of places used only for nonresidential purposes, or places that were uninhabitable and were not included in the census counts.

To permit the production of census population counts, it was necessary for each census address to have a status of occupied/vacant/nonexistent and a household size if occupied. To permit the production of the redistricting file and other more detailed census products, it was also necessary to have information about each person such as age, race, and sex. The count imputation covered status for housing units with undetermined status and household size for occupied units with an unknown number of occupants. The characteristic imputation was used to fill in the missing person data.

Census housing units identified in the Accuracy and Coverage Evaluation (A.C.E.) block clusters were defined as the E-sample housing units. Persons residing in these housing units were E-sample persons. It took several different census operations to establish a list of census housing unit records and a list of census person records. One of these operations was the creation of a Hundred Percent Census Unedited File (HCUF). At the housing unit level, all housing units designated as occupied or vacant through data collection or through imputation were included in the HCUF. The file was used as a source file to identify the E-sample housing units for the A.C.E. operations. At the person level, the HCUF was used as a source file for person matching between the census and the A.C.E. (However, this does not include imputed persons, since they were not sent to A.C.E. matching.) Chapter 3 provides detailed information on E-sample identification, while Chapter 4 provides information on person matching.
¹ This does not include geographic characteristics such as location, urban or rural residency etc., which are generally known for all households.

Persons imputed to an occupied unit with an unknown number of occupants or persons with all their characteristics imputed were considered as non-data-defined persons in the person Dual System Estimation (DSE). For data-defined persons, characteristic imputation filled in census missing data, such as sex, age, ethnicity, and owner/renter status for person DSE poststratification purposes.

COUNT IMPUTATION

The Census Bureau used count imputation for three categories of cases in Census 2000.
1. Household size imputation. The Census Bureau imputed the number of occupants for a housing unit when Census Bureau records indicated that the housing unit was occupied, but did not show the number of individuals residing in the unit.

2. Occupancy imputation. When Census Bureau records indicated that a housing unit existed, but not whether it was occupied or vacant, the Census Bureau imputed occupancy status (occupied or vacant). If the unit was imputed to be occupied, the household size was also imputed.

3. Status imputation. When the Census Bureau's records had conflicting or insufficient information about whether an address represented a valid, nonduplicated housing unit, the Census Bureau first imputed for the status of the unit (occupied, vacant, nonexistent), then, if occupied, the household size was imputed.

Methodology

The Census Bureau used the nearest-neighbor hot deck imputation methodology to perform the count imputation.
"Nearest" was defined by the geographical closeness of housing units. Group quarters addresses were included in the measure of distance, although not otherwise involved in count imputation. Census geographical identifiers, such as tract number, block number, or map spot number, along with street name, house number or apartment number were used to describe geographical proximity of housing records. To properly assign status and number of occupants to the housing units requiring imputation, limited donor pools and expanded donor pools were developed for each imputation category, which were further subdivided by type of structure.

All cases with missing status, occupancy or household size went through intensive follow-up operations to reduce the amount of imputation as much as possible.
This was the main purpose of the NRFU and CIFU for mailout/mailback areas and enumerator visit in list/enumerate or update/enumerate areas. To properly represent these cases (donees), the primary donor pools were also housing units from NRFU, CIFU or from other enumerator visited cases. In the design phase, the Census Bureau did develop a standby procedure to include all enumerations in an expanded donor pool. With 99.5 percent of housing units having status and household size information available from data collection activities, the expanded donor pools were never used. The chart below characterizes the relationship between donees and the primary donor pool by imputation category.

Donors and Donees by Imputation Category

Household size imputation (a. Single units; b. Multiunits)
    Donees:     Occupied with unknown household population
    Donor pool: Occupied units with known population (in NRFU, or CIFU, or from list/enumerate or update/enumerate areas)

Occupancy imputation (a. Single units; b. Multiunits)
    Donees:     Units known to be either occupied or vacant
    Donor pool: Occupied units with known population or vacant units (in NRFU, or CIFU, or from list/enumerate or update/enumerate areas)

Status imputation (a. Single units; b. Multiunits)
    Donees:     Units with no status information
    Donor pool: Occupied units with known population, vacant units, or nonexistent units (in NRFU, or CIFU, or from list/enumerate or update/enumerate areas)

In general, type of structure (multi or single), type of enumeration (mail or list/enumerate), and final stage of the data collection for a housing unit (initial collection, NRFU, or CIFU) determined whether a housing record could be used as a primary donor. Each available donor could only be used once. Most of the time, the nearest potential donor was selected as the donor. Occasionally, a second nearest neighbor was designated as the donor, because the nearest donor had been taken by some previously processed donee. Whenever possible, the donor and donee were to be in the same tract, or in the same multiunit if the donee was located in a multiunit building.

To identify the nearest donor, a search was conducted in both directions: forward and backward. Using the donee as a reference point, potential donors surrounding the donee record were searched, and the donor housing unit geographically closest to the donee housing unit was determined. The search was done separately for single units and multiunits.

CHARACTERISTIC IMPUTATION

Characteristic imputation was the process of filling in missing person characteristics, which include sex, age/date of birth, relationship, Hispanic origin and race.
The Census Bureau used characteristic imputation for three categories of cases in Census 2000.

1. Whole household imputation. The Census Bureau imputed all of the characteristics for all of the persons in the household when the household record did not contain any data-defined persons. To be data-defined, a person record must contain two or more of the 100-percent population data items, or a name.

2. Within household imputation. The Census Bureau imputed all the 100-percent characteristics for any non-data-defined persons in the household when the household record contained at least one data-defined person.

3. Within person imputation. Sometimes some of the 100-percent characteristic data for data-defined persons were missing and were imputed.
Methodology

The categories of characteristic imputation employ different methodologies. For whole household imputations, the process replicates all of the 100-percent person data items by substituting data from a hot deck nearest neighbor donor pool record of the same household size. This process is sometimes referred to as substitution, since it assigns all the characteristics for all of the persons in the selected donor household to the household requiring imputation. This substitution process is also used to obtain the person characteristics for those housing units that were imputed as occupied or had their household size imputed during the count imputation process. By definition these households do not contain any data-defined persons. However, the majority of whole household imputations occur for cases where a census response on household size was obtained.

For within household imputations as well as within person imputations, the process allocates missing values for individual person characteristic data items on the basis of other reported information for other persons in the household, or from other persons in households with similar characteristics.
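The nearest-neighbor hot deck idea used for count imputation and whole household substitution can be illustrated with a small sketch. It makes several simplifying assumptions that are not in the source. Geographic closeness is reduced to the distance between positions in a list already sorted in geographic order. Donor pools are not split by structure type or restricted to a tract. Ties between a backward and a forward donor go to the earlier position. The function name `impute_household_size` and the data are hypothetical.

```python
def impute_household_size(units):
    """Assign each donee the household size of its nearest unused donor.

    units: list of dicts in geographic order; 'size' is an int for donors
    and None for donees. Returns {donee_index: imputed_size}.
    """
    used_donors = set()
    imputed = {}
    for idx, unit in enumerate(units):
        if unit["size"] is not None:
            continue  # reported size, nothing to impute
        # Search both directions for donors that have not been used yet.
        candidates = [k for k, other in enumerate(units)
                      if other["size"] is not None and k not in used_donors]
        if not candidates:
            continue  # no donor left; a real system would widen the pool
        donor = min(candidates, key=lambda k: (abs(k - idx), k))
        used_donors.add(donor)  # each donor can be used only once
        imputed[idx] = units[donor]["size"]
    return imputed

# Hypothetical street segment: two enumerated units followed by two donees.
units = [{"size": 5}, {"size": 2}, {"size": None}, {"size": None}]
result = impute_household_size(units)
```

In this example the first donee takes the adjacent unit's size of 2. The second donee's nearest donor has then already been used, so it falls back to the next nearest unit and receives a size of 5. This mirrors the second-nearest-neighbor case described for count imputation.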
RESULTS

This section briefly summarizes the overall level of imputation for people whose 100-percent characteristics were totally imputed in Census 2000 (within person imputations are excluded) for the U.S. population residing in housing units.

Census 2000 Housing Unit Persons by Imputation Category
(Excludes within person imputations)

Category                                              Number of persons    Percent of total persons
Total housing unit population                               273,643,272                      100.00
100-percent characteristic imputation not required          267,869,007                       97.89
100-percent characteristic imputation required                5,774,266                        2.11
  Count imputations:                                          1,172,144                        0.43
    Household size                                              495,600                        0.18
    Occupancy                                                   260,652                        0.10
    Status                                                      415,892                        0.15
  Characteristic imputations                                  4,602,122                        1.68
    Whole household¹                                          2,269,010                        0.83
    Within household                                          2,333,112                        0.85

¹ The count imputation cases (also requiring characteristic imputation) are not included in this figure to avoid duplication.

About 2 percent of persons residing in housing units required imputations of all 100-percent characteristics. The majority of these cases, about 1.7 percent, occurred in situations where a census response on household size was obtained. Less than a half of a percent were situations where household size or the status of the housing unit was unknown.

Appendix B. Demographic Analysis

INTRODUCTION

The Census Bureau has used Demographic Analysis (DA) to measure population coverage, trends between censuses, and differences in coverage by age, sex, and race (Black, non-Black) at the national level in every census since 1960 (Siegel and Zelnik (1966), Siegel (1974), Fay et al. (1988), and Robinson et al. (1993)). DA produces estimates of the U.S. population through the use of data from administrative records and other noncensus sources. It has documented both the long-term reduction in the census net undercount rate and the persistent and disproportionate undercount of certain demographic groups, such as Black men. One goal of Census 2000 was to reduce these differential undercounts, which has been a continuing effort for the last several censuses.

The independence from the census and internal consistency of the DA estimation process allow us to compare the results with the survey-based Accuracy and Coverage Evaluation (A.C.E.) coverage estimates; in particular, the consistency of the age-sex results can be assessed. DA and A.C.E. use entirely different methodologies. Because the sources and patterns of errors in the two estimates are sufficiently different, any disagreement in the results can shed light on both the quality of the census and potential problems in methodology in the A.C.E. or the DA. Because of data limitations, DA estimates and comparisons are only possible at the national level and for certain large demographic groups. A further discussion of DA limitations is found in the section "Limitations of DA Estimates" of this appendix.

The U.S. Census Bureau released two sets of DA results as part of its evaluation of Census 2000 and the A.C.E. All DA results in this section are from the revised values released in October 2001. See Robinson (2001) for details.

DESCRIPTION OF THE DEMOGRAPHIC ANALYSIS METHOD

Demographic Analysis represents a macro-level approach for measuring coverage. Estimates of net undercount are obtained by comparing census counts to independent estimates of the population derived from other measures (mostly administrative data). In general, DA population estimates are developed for the census date by combining various types of demographic data that are independent of the census and are highly reliable, such as administrative statistics on births, deaths, and Medicare data and estimates of immigration and emigration. The difference between the DA estimated population (P) and the census count (C) provides an estimate of the net census undercount (u). Dividing the net undercount by the DA benchmark provides an estimate of the net undercount rate (r):
uPC ruP100Theparticularanalyticprocedureusedtoestimatecover-agenationallyforthevariousdemographicsubgroupsdependsprimarilyonthenatureandavailabilityoftherequireddemographicdata.Twodifferentdemographic techniqueswereusedtoproducethedemographicanaly-sisestimatesfor2000,oneforthepopulationunderage65andanotherforthepopulation65andover.Agesunder65.TheDemographicAnalysisestimatesforthepopulationbelowage65arebasedonthecompilationofhistoricalestimatesofthecomponentsofpopulation change:birthssince1935(B),deathstopersonsbornsince1935(D),immigrantsbornsince1935(I),andemi-grantsbornsince1935(E).Presumingthatthecompo-nentsareaccuratelymeasured,thepopulationestimates (P 0-64)arederivedbythebasicdemographicaccountingequationappliedtoeachbirthcohort: | |||
$$P_{0-64} = B - D + I - E$$

The size of the component estimates used to develop the DA population under age 65 for 2000 is shown in Table B-1:

Table B-1. DA Estimates of the Components of Change for the U.S. Resident Population: April 1, 2000

Component                                                Estimate
Total population .................................... 281,759,858
Under age 65 in 2000
  + Births since 1935 (B) ........................... 234,860,298
  - Deaths to persons born since 1935 (D) ............ 14,766,736
  + Immigration of persons born since 1935 (I) ....... 32,563,971
  - Emigration of persons born since 1935 (E) ......... 5,485,117
Ages 65 and over in 2000
  Medicare-based population .......................... 34,587,440

Clearly, births (234.9 million) represent by far the largest component. The immigration component (32.6 million) is second largest, followed by deaths (14.8 million) and emigrants (5.5 million).

The actual calculations are carried out for single-year birth cohorts. For example, the estimate of the population age 40 on April 1, 2000 is based on births from April 1959 to March 1960 (adjusted for under-registration), reduced by deaths to the cohort in each year between 1959 and 2000, and adjusted for estimated immigration and emigration of the cohort over the 40-year period.

The components for births and deaths are compiled principally from vital statistics records augmented by correction factors. The immigration component is estimated from its
subcomponents:

Table B-2. DA Estimates of the Components of Immigration for the U.S. Resident Population Under 65 Years of Age: April 1, 2000

Component                                                     Estimate
Legally admitted permanent residents ..................... 20,332,038
Other measured migration .................................. 2,249,001
  Migrants from Puerto Rico ................................. 905,698
  Temporary migrants ........................................ 776,002
  Civilian citizen migration ................................ 891,940
  Armed Forces overseas .................................... -324,639
Residual foreign-born migration
  (includes unauthorized migrants) ........................ 9,982,932

Age 65 and over. Administrative data on aggregate Medicare enrollments are used to estimate the population age 65 and over ($P_{65+}$):

$$P_{65+} = M + m$$

where M is the aggregate Medicare enrollment and m is the estimate of underenrollment in Medicare. The DA population 65 and over is based on 2000 Medicare enrollments. Medicare is an administrative data set from the Health Care Financing Administration. Although Medicare enrollment is generally presumed to be quite complete, adjustments are made to the basic data to account for individuals who are omitted. An allowance is made for the estimated 1.3 million not enrolled (3.9 percent). Underenrollment factors are based on survey estimates of Medicare coverage and data on age at enrollment in the Medicare file. The DA population aged 65 and over (34.6 million) represents 12.3 percent of the total population in
2000.

The demographic component estimates for the population under 65 are combined with the Medicare-based estimate for the population 65 and over to produce the total DA population estimate of 281.8 million as of April 1, 2000.

LIMITATIONS OF DA ESTIMATES

DA estimates for the total population are available only at the national level and only for the broad categories Black and non-Black. DA cannot provide estimates for subnational geographic areas like states or metropolitan areas, or for other demographic groups, such as Hispanics. DA also cannot provide separate estimates for census overcoverage and undercoverage, but is limited to estimating net undercount.

There are also certain inherent limitations on DA estimates because of data quality. The race categories reflect the race as assigned at the time of the event (e.g., birth or Medicare enrollment), which for some persons will differ from the race reported in the census. There is also considerable uncertainty in the quality of the data for some of the components related to immigration, most importantly the components which capture those who entered illegally or temporarily, or whose legal status had not yet been determined.

DA ESTIMATES

Compared to the Census 2000 count of 281.4 million, the DA estimate of 281.8 million implies a net census undercount of 0.12 percent (see Table B-3). The net census undercount in 2000 was far smaller than that in 1990, which was 4.2 million, or 1.65 percent. However, the fact that DA provides only a net undercount estimate, not separate measures of gross undercount and overcount, is a limitation on its ability to shed light on specific undercoverage or overcoverage problems in the census.

Table B-3. Demographic Analysis Estimate and Net Census Undercount for the Total Population: 1990 and 2000

Category                                   1990 Census   2000 Census
DA (millions) ..........................       252.9         281.8
Difference from Census (millions) ......         4.2           0.3
Percent difference .....................        1.65          0.12

The DA estimates indicate that the substantial reduction in net census undercount from 1990 to 2000 was shared by almost all demographic groups. The net census undercount of males and females each fell by about 1.5 percentage points (to an estimated net census undercount of 0.86 percent for males and an estimated net census overcount of 0.60 percent for females in 2000). The estimated net undercount rate dropped more for Blacks (estimated net census undercount of 2.78 percent in 2000) than for non-Blacks (estimated net census overcount of 0.29 percent in 2000), reducing the differential undercount of Blacks relative to non-Blacks from 4.4 percentage points in 1990 to 3.1 points in 2000.

Table B-4. Demographic Analysis Estimates of Percent Net Census Undercount for the Total Population and Selected Demographic Groups: 1990 and 2000

Category                                   1990 DA   2000 DA
Total ..................................     1.65      0.12
Male ...................................     2.39      0.86
Female .................................     0.93     -0.60
Black ..................................     5.52      2.78
Non-Black ..............................     1.08     -0.29
Black male, ages 20-64 .................    11.31      8.44
Children, ages 0-4 .....................     3.72      3.84

(a minus sign denotes a net overcount)

Appendix C. Weight Trimming

INTRODUCTION

This appendix contains a general overview of the Accuracy and Coverage Evaluation (A.C.E.) weight trimming plan. The procedure was designed to protect against undue influence from a small fraction of the sample. The weight trimming criteria were established prior to the completion of data processing operations to ensure that there was no manipulation of the dual system estimates. This procedure was implemented according to the prespecified criteria. Since only one cluster was trimmed, the impact on the dual system estimates was minimal.

The A.C.E. weight trimming procedure was designed to reduce the sampling weights for clusters that potentially could have had an extreme influence on the dual system estimates and variances. The measure of cluster influence was the net cluster error, the absolute difference between the weighted estimate of nonmatches and the weighted estimate of erroneous enumerations. When the net error exceeded a pre-set maximum value, the sampling weights were reduced. This approach reduced variance and may have introduced some bias, but it is highly likely to have reduced the mean square error for most items. If the net error of the cluster did not exceed the pre-set maximum value, the sampling weights were unchanged.

The net error criterion was examined after the A.C.E. person matching operation was completed. If the criterion for weight trimming was met, trimming was applied to all sample cases in a cluster, even if the cluster contributed sample to multiple post-strata. This was done prior to the missing data process.
BACKGROUND

Weight trimming guards against the possibility of a small number of clusters exerting an undue influence on post-stratum estimates and variances. In the A.C.E., such influence was expected to be due to a disproportionate number of census nonmatches or census erroneous enumerations within a few block clusters. Although extreme sampling weights can also be a source of influence in surveys, the A.C.E. sampling weights, the inverse of the probability of selection, were reasonably controlled in the sample design. These were not expected to be an important source of variance in the A.C.E.

While the A.C.E. sample design helped minimize the occurrence of highly influential clusters, a weight trimming plan was developed to reduce the effect of these potential extreme clusters. The A.C.E. weight trimming plan was a modification of the method used for the 1990 Post-Enumeration Survey (PES). As in 1990, the weights for extremely influential clusters were trimmed to yield a prespecified net error. The intention of the plan was to lessen the impact of such clusters on the dual system estimates and variances.

The plan did not redistribute the weights across the remaining clusters to preserve totals. Doing so would have meant treating the E and P samples differently to preserve these separate totals, which contradicts the preference for consistent treatment of both samples. Since the primary interest was in the dual system estimation ratios, and not E- and P-sample totals, the weights were not redistributed.

A.C.E. WEIGHT TRIMMING METHODOLOGY

Each cluster was evaluated to determine if it contributed disproportionately to the dual system estimates and variances. If the cluster was an outlier, the cluster sampling weight was multiplied by a factor to decrease the influence of the cluster on the dual system estimates and variances.

Identify Outlier Clusters

A measure of the cluster influence was calculated for each cluster. Then, based on pre-set criteria, a decision was made whether the cluster should be identified as an outlier.

Cluster Influence. The measure of cluster influence was the net error. For purposes of weight trimming, the net error was the absolute difference between the weighted number of nonmatches and the weighted number of erroneous enumerations. The form of the weighted net error
was

$$Z_i = \left| (P_i - M_i) - (E_i - CE_i) \right| \qquad (1)$$

where
$Z_i$ = the net error estimate for cluster i,
$P_i$ = the weighted P-sample population estimate for cluster i,
$M_i$ = the weighted P-sample match estimate for cluster i,
$E_i$ = the weighted E-sample population estimate for cluster i, and
$CE_i$ = the weighted E-sample correct enumeration estimate for cluster i.

The first term of equation (1) was the weighted number of nonmatches in the ith cluster, while the second term was the weighted number of erroneous enumerations in the ith cluster.

Outlier Criteria. The outlier criterion was the maximum allowable net error for a single cluster. There were two different criteria based on the cluster geography. The nation was classified into two levels of geography: American Indian Reservations and the balance of the nation. The American Indian Reservation clusters were sampled at disproportionately higher rates relative to the balance of the country. In addition, separate American Indian on American Indian Reservation post-stratum estimates were planned. If the American Indian Reservation clusters had been included with the rest of the nation, it is unlikely that an influential cluster among them would have been detected. The two outlier criteria are defined in Table C-1.

Table C-1. Outlier Cluster Criteria

Cluster geography                             Maximum net error
American Indian Reservations ....................      6,250
Balance of the United States ....................     75,000

All clusters with net error greater than the maximum allowable net error were considered influential clusters.
They were expected to disproportionately influence the dual system estimates and variances. The sampling weights of these clusters were decreased.

The maximum net error for the balance of the country was based on experience in the 1990 PES. Since the A.C.E. was roughly double the PES sample size, the maximum allowable net error was set to be half the 1990 value. For the American Indian Reservation clusters, the maximum allowable value was a function of the average sampling rates. The American Indian Reservation average P-sample cluster sampling weight was approximately one-twelfth the balance of the U.S. average P-sample cluster sampling weight. Because of this, the American Indian Reservation maximum allowable net error was one-twelfth the balance of the U.S. criterion.

Implementation Strategy. The outlier clusters were identified after the person matching operation (Chapter 4) was completed, but before the missing data process (Chapter 6). The person matching results were the major input into this process. The weight trimming estimate used the best estimate of cluster net error that was operationally feasible at that time. This timing had several implications:

* Only nonmovers and outmovers were used for deriving the estimate of nonmatches above. For dual system estimation, if the number of outmovers in a post-stratum was less than 10 then only the nonmovers and outmovers were used. Because of the small number of movers expected in most clusters, this process only used nonmovers and outmovers.
* Some nonmovers and outmovers had unresolved match status and residence status. Some E-sample cases had unresolved enumeration status. This meant the status of unresolved cases had to be estimated to identify outlier clusters. Information available at the time of the weight trimming process was used to approximate the status of the unresolved cases. Since the weight trimming process was done before the missing data process, some information that the missing data process used to estimate unresolved status was not yet available.
* A P-sample noninterview adjustment was approximated in the estimate of nonmatches. Information available during the weight trimming process was used to approximate the noninterview adjustment for each cluster. As with the unresolved cases, since the weight trimming process was done before the missing data process, some information that the missing data process used to do the noninterview adjustment was not yet available.
* The targeted extended search results and sampling rates were reflected in the estimate of nonmatches and erroneous enumerations (Chapter 5).

Down-Weighting Outlier Cluster

All outlier clusters were down-weighted, so that no cluster contributed more than the maximum allowable number of net errors for the appropriate geography. A separate down-weighting factor was computed for each outlier cluster. The down-weighting factor was the ratio of the outlier cluster criterion to the cluster net error computed above.

$$D_i = \frac{C}{Z_i} \qquad (2)$$

where
$D_i$ = the down-weighting factor for cluster i,
$C$ = the maximum net error from Table C-1 for the appropriate level of geography, and
$Z_i$ = the net error estimate for cluster i from (1).

The cluster down-weighting factor was applied to the P-sample and the E-sample weights of the outlier cluster.
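The outlier test and down-weighting factor in equations (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration, not the Census Bureau's production code: the `Cluster` class, the geography keys, and the function names are hypothetical, and the P- and E-sample totals in the example are invented so that the nonmatch and erroneous enumeration estimates equal the published values for the one trimmed cluster (1,396 and 79,371).

```python
from dataclasses import dataclass

# Maximum allowable net error by geography, from Table C-1.
# The dictionary keys are hypothetical labels chosen for this sketch.
MAX_NET_ERROR = {
    "american_indian_reservation": 6_250.0,
    "balance_of_us": 75_000.0,
}


@dataclass
class Cluster:
    geography: str    # key into MAX_NET_ERROR
    p_total: float    # P_i: weighted P-sample population estimate
    p_matches: float  # M_i: weighted P-sample match estimate
    e_total: float    # E_i: weighted E-sample population estimate
    e_correct: float  # CE_i: weighted E-sample correct enumeration estimate


def net_error(cluster: Cluster) -> float:
    """Equation (1): Z_i = |(P_i - M_i) - (E_i - CE_i)|."""
    nonmatches = cluster.p_total - cluster.p_matches
    erroneous = cluster.e_total - cluster.e_correct
    return abs(nonmatches - erroneous)


def down_weight_factor(cluster: Cluster) -> float:
    """Equation (2): D_i = C / Z_i for outliers; 1.0 leaves weights unchanged."""
    z = net_error(cluster)
    cap = MAX_NET_ERROR[cluster.geography]
    return cap / z if z > cap else 1.0


# Invented totals; only the two differences (1,396 and 79,371) are published.
example = Cluster("balance_of_us", 10_000.0, 8_604.0, 100_000.0, 20_629.0)
factor = down_weight_factor(example)
print(net_error(example), factor, factor * net_error(example))
```

Multiplying both the P-sample and E-sample weights of the cluster by the same factor scales the net error down to the cap, which is why a single factor per cluster is enough.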
The P-sample and E-sample weights for the remaining clusters were unchanged.

A.C.E. WEIGHT TRIMMING RESULTS

Table C-2 shows the one cluster down-weighted by the weight trimming process in the balance of the United States. No clusters were trimmed on American Indian Reservations.

Figures C-1 and C-2 show the distributions of net error before weight trimming for the balance of the United States and the American Indian Reservation areas.

Table C-2. A.C.E. Weighted Net Errors for Down-Weighted Cluster

Geographic area: Balance of the United States
Before trimming
  Estimated erroneous enumerations ............ 79,371
  Estimated weighted nonmatches ................ 1,396
  Estimated weighted net error ................ 77,975
After trimming
  Estimated weighted net error ................ 75,000

Figure C-1. Distribution of Net Error for the Balance of the United States
(Histogram: number of clusters by net error before trimming, horizontal axis 0 to 80,000.)

Figure C-2. Distribution of Net Error for American Indian Reservations
(Histogram: number of clusters by net error before trimming, horizontal axis 0 to 7,000.)

Appendix D. Error Profile for A.C.E. Estimates

INTRODUCTION

The Accuracy and Coverage Evaluation (A.C.E.) survey provided estimates of census coverage error that have been considered for adjusting Census 2000. The estimation used the PES-C version of dual system estimation with the data collected by the A.C.E. The adjusted estimates are subject to nonsampling error, as well as sampling error. This appendix discusses the types of errors found in the use of PES-C and the measurement of these errors.

OVERVIEW OF ADJUSTED ESTIMATES

Define the following notation for each post-stratum h.
$C_h$ = census count for post-stratum h
$II_h$ = number of persons imputed into the original enumeration for post-stratum h
$\hat{I}_{E,h}$ = estimated number of enumerations in post-stratum h with insufficient information for matching¹
$\hat{E}_{E,h}$ = estimated number of erroneous enumerations in post-stratum h
$\hat{N}_{E,h}$ = estimated population size for post-stratum h from the E sample
$\widehat{CE}_h$ = estimated population size for post-stratum h who could possibly be matched

$$\widehat{CE}_h = \hat{N}_{E,h} - \hat{I}_{E,h} - \hat{E}_{E,h}$$

$\hat{N}_{P,h}$ = estimated size of the P-sample population
$\hat{M}_h$ = estimated number of the P-sample population enumerated in the census

The dual system estimator for the population size $\hat{N}_h$ in post-stratum h is defined by

$$\hat{N}_h = (C_h - II_h) \left( \frac{\widehat{CE}_h}{\hat{N}_{E,h}} \right) \left( \frac{\hat{N}_{P,h}}{\hat{M}_h} \right).$$

The 2000 A.C.E. used the PES-C formulation of the dual system estimator, which uses the number of inmovers to estimate the number of outmovers, but uses the match rate for the outmovers to obtain the estimate of the number of outmovers that match the census. The post-stratum index h is suppressed in the following formula.

$$\frac{\hat{N}_P}{\hat{M}} = \frac{\hat{N}_n + \hat{N}_i}{\hat{M}_n + \left( \hat{M}_o / \hat{N}_o \right) \hat{N}_i}$$

where
$\hat{N}_n$ = estimated number of nonmovers
$\hat{N}_o$ = estimated number of outmovers
$\hat{N}_i$ = estimated number of inmovers
$\hat{M}_n$ = estimated number of nonmovers enumerated in the census
$\hat{M}_o$ = estimated number of outmovers enumerated in the census

When a post-stratum had fewer than 10 outmovers, the PES-A version of the dual system estimator, which does not use inmovers, was employed as follows:

$$\frac{\hat{N}_P}{\hat{M}} = \frac{\hat{N}_n + \hat{N}_o}{\hat{M}_n + \hat{M}_o}$$

The adjustment factor for post-stratum h is defined as $\hat{A}_h = \hat{N}_h / C_h$. The unadjusted estimate for area j is $N_{unadj,j} = \sum_h C_{h,j}$ and the adjusted estimate is $\hat{N}_{adj,j} = \sum_h \hat{A}_h C_{h,j}$. The estimate of undercount in the population size of area j is $\hat{N}_{adj,j} - N_{unadj,j}$ and the estimate of the corresponding undercount rate is $(\hat{N}_{adj,j} - N_{unadj,j}) / \hat{N}_{adj,j}$.

SOURCES OF ERROR IN ADJUSTMENTS

The adjusted estimates are subject to a variety of possible sources of error: sampling error, data collection and survey operations error, missing data, error from exclusion of late census data and data with insufficient information for matching, contamination error, correlation bias, synthetic estimation bias, inconsistent post-stratification, and balancing error.

P-Sample Matching Error and E-Sample Processing Error
Source. The term P-sample matching has been used to describe the search of the census records for enumerations for P-sample respondents. The P-sample respondents are designated as matching an enumeration in the census or as not enumerated. The counterpart for the E sample is called E-sample processing, where census enumerations are designated as correctly enumerated or erroneously enumerated. When the status of a P-sample or E-sample case cannot be determined, it is designated as unresolved. P-sample matching error refers to the net effect of errors that occur during the processing that affect the determination of whether a P-sample person matches a census enumeration. Likewise, the net effect of errors in assigning enumeration status to E-sample enumerations during the office processing is called E-sample processing error.

¹ Late enumerations are included with imputations in the original enumeration.

Errors may occur in either direction. P-sample people may be designated as matching a census enumeration although they are not in the census, called a false match, or people may be designated as not enumerated although they are, called a false nonmatch. E-sample enumerations may be falsely assigned a correct enumeration status, called a false correct enumeration, or enumerations may be incorrectly designated as an erroneous enumeration, a false erroneous enumeration.

Matching error also encompasses errors in the size of the P-sample population that may happen during the processing of the P-sample. These errors also may occur in either direction. A person included as a member of a household may really reside at another location or not be in the population of interest. For example, the census residency rules consider family members away at college to reside at their college address. A family member in a nursing center is considered to be in the group quarters population, which is not part of the population of interest. Conversely, a person with two homes may be designated as living at the other home but really live at the one in the sample.

In the application of PES-C, respondents have many more potential statuses than were possible in the processing of the P-sample in 1990. The reason is that a P-sample respondent may be a nonmover, an outmover, an inmover, or an out-of-scope person. The nonmovers and outmovers have another characteristic, which is resident or nonresident. A person who is living at the sample address on Census Day is called a resident.

Errors in mover status may go in all directions. A person designated as a nonmover may be an inmover or an outmover. All combinations of errors may happen and affect the DSE in different ways.
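Because mover status enters the estimator in several places, it can help to see the dual system estimator from the Overview of Adjusted Estimates written out as code. The following is a minimal sketch under stated assumptions, not the A.C.E. production system: the function and parameter names are hypothetical, and the "fewer than 10 outmovers" rule is assumed here to apply to an unweighted sample count passed in as `outmover_count`.

```python
def p_sample_ratio(n_nonmovers, n_outmovers, n_inmovers,
                   m_nonmovers, m_outmovers, outmover_count):
    """Return N_P / M for one post-stratum.

    Uses PES-A when the post-stratum has fewer than 10 outmovers
    (assumed to be an unweighted count), and PES-C otherwise.
    """
    if outmover_count < 10:
        # PES-A: inmovers are not used.
        return (n_nonmovers + n_outmovers) / (m_nonmovers + m_outmovers)
    # PES-C: inmovers estimate the number of movers, while the
    # outmover match rate estimates how many of them match the census.
    outmover_match_rate = m_outmovers / n_outmovers
    return (n_nonmovers + n_inmovers) / (
        m_nonmovers + outmover_match_rate * n_inmovers
    )


def dual_system_estimate(census_count, imputations,
                         correct_enumerations, e_sample_total, ratio):
    """N_h = (C_h - II_h) * (CE_h / N_E,h) * (N_P,h / M_h)."""
    return ((census_count - imputations)
            * (correct_enumerations / e_sample_total)
            * ratio)


def adjustment_factor(dse, census_count):
    """A_h = N_h / C_h."""
    return dse / census_count


# Illustrative (invented) weighted estimates for a single post-stratum.
ratio = p_sample_ratio(800.0, 40.0, 50.0, 720.0, 30.0, outmover_count=25)
dse = dual_system_estimate(1_000.0, 20.0, 900.0, 950.0, ratio)
print(ratio, dse, adjustment_factor(dse, 1_000.0))
```

A misclassified mover changes which of the five P-sample inputs a person contributes to, which is one way to see why the text says that combinations of mover-status errors affect the DSE in different ways.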
Definition. P-sample matching error affects the estimates of both nonmovers and inmovers in the estimate of the size of the P-sample population. In addition, matching error affects the estimates of the number of nonmover matches, the number of outmovers and outmover matches, and the number of inmovers in the estimate of the number of matches. E-sample data collection error affects the estimate of the number of erroneous enumerations. The post-stratum index h is suppressed in the following definitions.

$m_{nms}$ = net P-sample matching error in the nonmover component of $\hat{M}$
$m_{omims}$ = net P-sample matching error in the outmover and inmover component of $\hat{M}$
$n_{nms}$ = net P-sample matching error in the nonmover component of $\hat{N}_P$
$n_{ims}$ = net P-sample matching error in the inmover component of $\hat{N}_P$
$ee_s$ = net E-sample office processing error in $\widehat{CE}$

Under the assumption that all other errors are zero, the bias in the adjustment factors caused by P-sample matching error and E-sample processing error is defined as

$$B_{process,h} = \hat{A}_h - \left( \frac{C_h - II_h}{C_h} \right) \left( \frac{\widehat{CE}_h - B_{CE\text{-}process,h}}{\hat{N}_{E,h}} \right) \left( \frac{\hat{N}_{P,h} - B_{P\text{-}process,h}}{\hat{M}_h - B_{M\text{-}process,h}} \right).$$

The error component definitions include a ratio adjustment because they are estimated using the Evaluation Sample. The ratio adjustment for components from the P-sample is the ratio of the P-sample population total from the A.C.E. to the P-sample population total based on the Evaluation Sample, $\hat{N}_{PF}$. The ratio adjustment for the components from the E-sample is the ratio of the two E-sample totals defined comparably.

$$B_{M\text{-}process} = [m_{nms} + m_{omims}] \left[ \frac{\hat{N}_P}{\hat{N}_{PF}} \right]$$

$$B_{P\text{-}process} = [n_{nms}] \left[ \frac{\hat{N}_P}{\hat{N}_{PF}} \right]$$

$$B_{CE\text{-}process} = [ee_s] \left[ \frac{\hat{N}_E}{\hat{N}_{EF}} \right]$$

P-Sample and E-Sample Data Collection Error

Source. Errors may occur during the data collection. While an interview is in progress, the respondent may make an error in answering a question, or the interviewer may make an error in asking a question or recording the answer. Errors also occur when an interviewer goes to the wrong address. Regardless of whether the error is caused by the respondent, the interviewer, or a combination of the two, such errors may cause the matching operation to assign mover status, residency status, or match status incorrectly to a person on the household roster. The A.C.E. interviewer collects both a Census Day roster and an Interview Day roster. A person who resides at the household on both days is classified as a nonmover. A person who lived there only on Census Day is an outmover, while a person who lived there only on Interview Day is an inmover. Persons classified as outmovers and nonmovers may or may not have been residents at the address on Census Day. Errors in the mover status, residency status, or other errors may cause the matching operation to fail to determine that a person was enumerated and to classify the person as a nonmatch incorrectly.

Sometimes people listed on household rosters do not exist. A more likely scenario is that an interviewer who is having trouble contacting the residents of a housing unit copies the name from a mailbox and fills in the characteristics. This type of error is called P-sample fabrication. Usually fabricated households cause an underestimate of the match rate, because they are smaller than the average household size and do not match.

A special type of E-sample data collection error is the failure to identify duplicate enumerations. The processing includes a search for duplicate enumerations within the block cluster and the surrounding blocks. Duplicate enumerations outside the block cluster and surrounding
blocks are more difficult to find. Identifying these duplications requires the respondent to provide information concerning another address where a household member may also be enumerated. Errors may occur when the respondent does not understand the residency rules or is unaware that a household member may be enumerated at another address. The situations most prone to causing duplicate enumerations are college students enumerated at their family home and their college address, children in joint custody agreements enumerated at both parents' addresses, and people with two residences.

Another type of field error occurs during the listing of the housing units for the census or for the P-sample. The housing units listed as being in the sample block may be in another block or vice versa. These types of errors are called geocoding error. To account for minor geocoding errors in 2000, the search for matches occurred within all block clusters and also in surrounding blocks for a sample of the cases with geocoding errors recorded in the E sample, a design called Targeted Extended Search (TES). The variance estimates for the A.C.E. account for the TES design. Flaws in the execution of the TES may result in biases.

Definition. P-sample fabrication and data collection error affect the estimates of both nonmovers and inmovers in the estimate of the size of the P-sample population. In addition, fabrication and data collection error affect the estimates of the number of nonmover matches, the number of outmovers and outmover matches, and the number of inmovers in the estimate of the number of matches. E-sample data collection error affects the estimate of the number of erroneous enumerations. Again, the post-stratum index h is suppressed in the following definitions.
$m_{nmr}$ = net P-sample data collection error in the nonmover component of $\hat{M}$
$m_{omimr}$ = net P-sample data collection error in the outmover and inmover component of $\hat{M}$
$n_{nmr}$ = net P-sample data collection error in the nonmover component of $\hat{N}_P$
$n_{imr}$ = net P-sample data collection error in the inmover component of $\hat{N}_P$
$ee_r$ = net E-sample data collection error in $\widehat{CE}$
$m_{nmfp}$ = net P-sample fabrication error in the nonmover component of $\hat{M}$
$m_{omimfp}$ = net P-sample fabrication error in the outmover and inmover component of $\hat{M}$
$n_{nmfp}$ = net P-sample data collection error in the nonmover component of $\hat{N}_P$
$n_{imfp}$ = net P-sample data collection error in the inmover component of $\hat{N}_P$

Under the assumption that all other errors are zero, the bias in the adjustment factors caused by P-sample data collection error and E-sample data collection error is defined as

$$B_{collect,h} = \hat{A}_h - \left( \frac{C_h - II_h}{C_h} \right) \left( \frac{\widehat{CE}_h - B_{CE\text{-}collect,h}}{\hat{N}_{E,h}} \right) \left( \frac{\hat{N}_{P,h} - B_{P\text{-}collect,h}}{\hat{M}_h - B_{M\text{-}collect,h}} \right).$$

The error component definitions include a ratio adjustment because they are estimated using the Evaluation Sample. The ratio adjustment for components from the P-sample is the ratio of the P-sample population total from the A.C.E. to the P-sample population total based on the Evaluation Sample. The ratio adjustment for the components from the E-sample is the ratio of the two E-sample totals defined comparably.

$$B_{M\text{-}collect} = [m_{nmr} + m_{nmfp} + m_{omimr} + m_{omimfp}] \left[ \frac{\hat{N}_P}{\hat{N}_{PF}} \right]$$

$$B_{P\text{-}collect} = [n_{nmr} + n_{nmfp} + n_{im}] \left[ \frac{\hat{N}_P}{\hat{N}_{PF}} \right]$$

$$B_{CE\text{-}collect} = [ee_r] \left[ \frac{\hat{N}_E}{\hat{N}_{EF}} \right]$$

Missing Data

Source. A.C.E. data may be missing for a variety of reasons: some A.C.E. interviews fail to take place, some households provide incomplete data on questionnaire items, and in some cases the information for classification as a match or nonmatch is ambiguous. The methods used to compensate for missing data effectively assume that the match status for the case with missing data is equal on average to the status for cases that are similar, except that they have complete data. Missing data on characteristics are imputed from otherwise similar cases with complete data. Nonresponse weighting adjustments are used to account for sampled but noninterviewed households. The P-sample matching and E-sample processing operation assigns unresolved enumeration status to a case when the available data are inadequate to determine whether the person is enumerated in the census, and a probability of being correctly enumerated is imputed for such cases.

Also, error in the resolved cases causes error in the imputations, because the resolved cases are used to form the imputations. Even if the imputation model were perfect, the imputations will have error if the data used to fit the model have error. This type of error is called imputation error due to data error.

Although one can consider the range of effects on the DSE by considering extreme alternatives (e.g., all unresolved matches truly are matches or truly are nonmatches), the range is too wide to be informative about the likely bias.
The bias from the method used to compensate for missing data can in principle be estimated from intensive follow-up of cases with missing data, but in practice the fraction completed by follow-up is too low. The Census Bureau analyzed the missing-data bias by looking at the changes in the DSE when alternative methods were used to compensate for missing data.

Results from the Analysis of Reasonable Alternative Imputation Models are used to estimate the variance component. See Keathley et al. (2001) for details. The results of Reasonable Alternative Imputation Models provided the data for the calculation of the variance-covariance matrix for the missing data component of the adjustment factors. The missing data variance-covariance matrix was added to the sampling variance-covariance matrix to obtain a variance-covariance matrix for the adjustment factors that contained the random error due to sampling and imputation for missing data.
Definition. The Census Bureau models the error due to imputation as a random effect and estimates its variance-covariance matrix. Modeling imputation error as a random effect is motivated by practicalities. In principle, the bias from the method used to compensate for missing data can be estimated from intensive follow-up of cases with missing data, but in practice the fraction completed by follow-up is too low. The variance component due to imputation for missing data has three components.

$$V_M = V_{RA} + V_B + V_I$$

where
$V_M$ = variance due to imputation
$V_{RA}$ = variance due to the imputation model selection
$V_B$ = variance due to the model parameter estimation
$V_I$ = within-person imputation variance.

The imputation variance components due to parameter estimation and within-person estimation are included in the sampling error estimates, leaving the variance due to model selection to be estimated separately. The missing data variance-covariance matrix is added to the sampling variance-covariance matrix to obtain a variance-covariance matrix for the adjustment factors that contains the random error due to sampling and imputation for missing data.

The components of imputation error due to data error affect the estimate of the number of nonmovers, the estimate of the number of nonmovers enumerated, the estimate of the match rate for the outmovers, and the estimate of the number of erroneous enumerations. The post-stratum index h is suppressed in the following definitions.
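$m_{nmi}$ = net imputation error due to data error in the nonmover component of $\hat{M}$
$m_{omi}$ = net imputation error due to data error in the outmover match rate component of $\hat{M}$
$n_{nmi}$ = net imputation error due to data error in the nonmover component of $\hat{N}_P$
$ee_i$ = net imputation error due to data error in $\widehat{CE}$.

Under the assumption that all other errors are zero, the bias in the adjustment factor $\hat{A}_h$ caused by imputation error due to data error is defined as

$$B_{impdata,h} = \hat{A}_h - \left( \frac{C_h - II_h}{C_h} \right) \left( \frac{\widehat{CE}_h - B_{CE\text{-}impdata,h}}{\hat{N}_{E,h}} \right) \left( \frac{\hat{N}_{P,h} - B_{P\text{-}impdata,h}}{\hat{M}_h - B_{M\text{-}impdata,h}} \right)$$

The error component definitions include a ratio adjustment because they are estimated using the Evaluation Sample. The ratio adjustment for components from the P-sample is the ratio of the P-sample population total from the A.C.E. to the P-sample population total based on the Evaluation Sample. The ratio adjustment for the components from the E-sample is the ratio of the two E-sample totals defined comparably.

$$B_{M\text{-}impdata} = [m_{nmi} + m_{omi}] \left[ \frac{\hat{N}_P}{\hat{N}_{PF}} \right]$$

$$B_{P\text{-}impdata} = [n_{nmi}] \left[ \frac{\hat{N}_P}{\hat{N}_{PF}} \right]$$

$$B_{CE\text{-}impdata} = [ee_i] \left[ \frac{\hat{N}_E}{\hat{N}_{EF}} \right]$$

Sampling Error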
Source. Sampling error gives rise to random error, quantified by sampling variance, and to a systematic error known as ratio-estimator bias. The sampling variance is present in any estimate based on a sample instead of the whole population. Ratio-estimator bias arises because even if X and Y are unbiased estimators, X/Y typically is a biased estimator of the ratio of their expected values.

Definition. The sampling variance and ratio-estimator bias for the adjustment factors are

$S^2$ = sampling variance-covariance matrix for the adjustment factors
$B_{ratio,h}$ = ratio-estimator bias in the adjustment factor $\hat{A}_h$ for post-stratum h.

Random sampling error is reflected in the estimated variance-covariance matrix of the $\hat{A}_h$'s. The covariance matrix is estimated by the Census Bureau's sampling-error software applied to the A.C.E. data. The software also can be used to produce estimates of ratio-estimator bias.

Correlation Bias

Source. If there is variability of the enumeration probabilities for persons in the same post-stratum, or if there is a dependence between enumeration in the census and in the A.C.E. (e.g., people less likely to be enumerated in the census may also be less likely to be found in the A.C.E.), then correlation bias may arise. Correlation bias is most likely a source of downward bias in the DSE. Evidence of correlation bias in national estimates is provided by sex ratios (males to females) for adjusted numbers that are low relative to ratios derived from demographic analysis of data on births, deaths, and migration.

The information from demographic analysis is insufficient to estimate correlation bias at the post-stratum level, however, and alternative parametric models have been used to allocate correlation bias estimates for national age-race-sex groups down to post-strata. Estimates of correlation bias at the national level provided by demographic analysis information also account for possible error from groups whose probabilities of enumeration are so low that the DSE will fail to account for them. The estimates of correlation bias based on sex ratios are affected by error in the demographic-analysis sex ratios and by possible other biases in the sex ratios in the DSE.
The assumptions and model underlying the measurement of correlation bias are discussed in detail in Bell (2001, 2001b), but are described briefly here. Although there are several models for how correlation bias is distributed, the two-group model was selected. The two-group model relies on the basic assumptions listed below for the estimation of correlation bias. In addition, sensitivity analyses assess the impact of variations in these assumptions.

* The ratio of males to females measured in demographic analysis is more reliable for the two racial groups, Black and non-Black, than the ratio from the A.C.E.
* There is no correlation bias present in the A.C.E. estimates for females.
* The relative correlation bias is equal across all A.C.E. post-strata within an age-race category.
* The relative impact of other nonsampling errors is equal for males and females at the national level.

The two-group model's assumption that the relative correlation bias is equal across post-strata within an age-sex category has the advantage of permitting the estimation of correlation bias through a multiplicative factor applied to the corrected DSE. Even more important, an unbiased estimate of the factor is available under the assumption that the relative impact of the other nonsampling errors is equal for males and females, without actually having to estimate the nonsampling errors.
Definition. Correlation bias usually causes the DSE to underestimate the population size.

$B_{correl,h}$ = correlation bias in the adjustment factor $A_h$ for post-stratum h.

Excluded-data Error from Reinstated Census Enumerations

Source. The DSE treats late census data as nonenumerations. Thus, duplicate enumerations among the late data do not contribute to census data, but valid enumerations among the late data are treated as census misses and are estimated by the DSE. If the late census data were excluded from the entire adjustment process and estimation, no new source of error would be present. The adjusted estimates do partially incorporate late census data by including them in $C_h$, but excluding them from the computation of $N_h$. This use of late data affects the estimates for areas with disproportionately many or few late adds, with an effect that is similar to synthetic estimation error. In addition, the exclusion of late census data from the E sample could bias the estimates at the post-stratum level.

There are two conditions that have to be met for the exclusion of the late adds from the processing of the A.C.E. not to bias the dual system estimates at the post-stratum level:

* The P sample covers the correct enumerations among the late adds at the same rate as other correct enumerations.
* The late adds occur in the E sample at the same rate as they occur in the census (excluding the imputations).
Definition. Error due to excluding the reinstated census enumerations in the calculation of the DSE affects the estimate of the DSE and therefore the adjustment factor directly.

$B_{reinstate,h}$ = bias in the adjustment factor $A_h$ for post-stratum h due to excluding reinstated census enumerations.

Contamination Error

Source. Contamination occurs when the A.C.E. selection of a given block cluster alters the implementation of the census there and affects enumeration results, e.g., by increasing or decreasing erroneous enumerations or by increasing or decreasing coverage rates. Contact with residents of the sample blocks during the listing for the P-sample may cause them to not respond to the census, because they think that the listing contact was their response to the census.
Definition. The bias in the adjustment factor for post-stratum h from contamination is defined as follows.

$B_{contam,h}$ = bias in the adjustment factor $A_h$ for post-stratum h due to contamination error.

Synthetic Estimation Bias

Source. The adjustment methodology relies on a method called synthetic estimation to provide the same adjustment factor $A_h$ for all enumerations in a given post-stratum, regardless of whether the enumerations are from the same geographic area. Synthetic estimation bias arises when census enumerations from different areas but in the same post-stratum should have different adjustment factors. To assess synthetic estimation bias for a given area one needs to develop an estimate based on data from the area alone, which is rarely possible. Attempts to estimate synthetic estimation bias in undercount estimates from analysis of artificial populations or surrogate variables, whose geographic distributions are known, are unconvincing. Therefore, sensitivity analyses have been conducted to assess the impact of synthetic estimation bias. These studies show that assuming synthetic estimation has a minor effect on uses of the data is reasonable.
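The mechanics of synthetic estimation can be sketched in a few lines. The example assumes that an area's adjusted estimate is the sum, over post-strata, of the post-stratum adjustment factor times the area's census count in that post-stratum. The post-stratum labels, factors, and counts are invented for illustration.

```python
def synthetic_estimate(area_counts, factors):
    """Adjusted estimate for one area.

    area_counts: census count in the area, keyed by post-stratum.
    factors: adjustment factor A_h, keyed by post-stratum.
    Every enumeration in post-stratum h receives the same factor,
    whatever area it comes from.
    """
    return sum(factors[h] * count for h, count in area_counts.items())


# Hypothetical adjustment factors for two post-strata.
factors = {"h1": 1.02, "h2": 0.99}

# Two hypothetical areas with different post-stratum compositions.
area_a = {"h1": 1000, "h2": 3000}
area_b = {"h1": 2500, "h2": 500}

adjusted_a = synthetic_estimate(area_a, factors)
adjusted_b = synthetic_estimate(area_b, factors)
```

The two areas receive different overall adjustments only because their post-stratum mixes differ. If coverage within a post-stratum actually varies by area, this calculation cannot reflect it, which is the source of synthetic estimation bias.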
Definition. The synthetic estimates may cause a bias in the adjusted estimates for area j. Error from synthetic estimation does not affect the dual system estimate for a post-stratum, only areas within a post-stratum.

$B_{syn,j}$ = bias in the adjusted estimate $N_{adj,j}$ for area j due to synthetic estimation error.

Inconsistent Post-stratification

Source. The computation of $CE_h / N_{E,h}$ requires census enumerations to be assigned to post-strata, and the computation of $N_{P,h} / M_h$ requires P-sample enumerations to be assigned to post-strata. When the assignments are not made consistently for the two samples, error arises in the ratio $N_{P,h} / M_h$. Inconsistent assignments to post-strata may be caused by misreporting of characteristics used in post-stratification. Cases that are most prone to inconsistent classification are those where there is a different respondent for the household in the census and the A.C.E. For example, a household member's age or race may be reported differently in a self-response than when another household member responds for the person. Such inconsistencies also may be due to computer processing errors, as well as inconsistent reporting.

The matches in the A.C.E. sample provide a source of data for estimating the error due to inconsistent post-stratification. An estimate of the error for a post-stratum may be formed by assuming the inconsistency rate observed in the matches also holds for those not matched.

Definition. Error due to inconsistent post-stratification affects the estimate of the DSE and therefore the adjustment factor directly.
$B_{inconsist,h}$ = bias in the adjustment factor $A_h$ for post-stratum h due to inconsistent post-stratification.

Error from Estimating Outmovers with Inmovers

Source. This error is unique to the PES-C model used in the A.C.E. For the PES-C model, the members of the P-sample are the residents of the housing units on Census Day. There is some difficulty in identifying all the residents of all the housing units on Census Day because some move prior to the A.C.E. interview. The A.C.E. interview relies on the respondents to identify those who have moved out, the outmovers. Since the outmovers are identified by proxies, many of the outmovers are not recorded. Therefore, the estimate of outmovers is too low. PES-C uses the number of inmovers to estimate the number of outmovers. The inmovers are those who did not live in the sample blocks on Census Day, but moved in prior to the A.C.E. interview. Theoretically the number of inmovers in the whole country should equal the number of outmovers. However, the number of inmovers may not equal the number of outmovers in a post-stratum because of circumstances such as economic conditions causing more people to move out of an area than to move into an area.
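A minimal sketch of how a PES-C style treatment of movers could be computed. It assumes, as PES-C is commonly described, that the mover count comes from the inmovers while the mover match rate comes from the outmovers who were identified. That match-rate step is not spelled out in the passage above, and the function name and all numbers are hypothetical.

```python
def pes_c_p_sample_totals(n_nonmovers, m_nonmovers,
                          n_inmovers, n_outmovers, m_outmovers):
    """Return (N_P, M) for one post-stratum under a PES-C style rule.

    The number of movers is taken from the inmovers, because outmovers
    reported by proxy are undercounted. The match rate for movers is
    taken from the outmovers that were identified (assumption).
    """
    outmover_match_rate = m_outmovers / n_outmovers
    n_p = n_nonmovers + n_inmovers
    matches = m_nonmovers + outmover_match_rate * n_inmovers
    return n_p, matches


# Hypothetical weighted totals: only 80 outmovers were reported,
# although 100 inmovers were found.
n_p, matches = pes_c_p_sample_totals(
    n_nonmovers=900, m_nonmovers=810,
    n_inmovers=100, n_outmovers=80, m_outmovers=60,
)
```

If the true number of outmovers in the post-stratum differs from the number of inmovers, both totals inherit that difference, which is the error this section describes.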
Definition. The error due to using the inmovers to estimate the outmovers affects the estimates of the size of the P-sample population and the number of matches.

$m_{io,h}$ = net P-sample data collection error in the mover component of $M_h$ in post-stratum h
$n_{io,h}$ = net P-sample data collection error in the mover component of $N_{P,h}$ in post-stratum h.

Under the assumption that all other errors are zero, the bias in the adjustment factors caused by P-sample matching error and E-sample processing error is defined as

$$B_{inout,h} = A_h - \frac{C_h - II_h}{C_h} \cdot \frac{CE_h}{N_{E,h}} \cdot \frac{N_{P,h} - B_{Pinout,h}}{M_h - B_{Minout,h}}.$$

The error component definitions include a ratio adjustment because they are estimated using the Evaluation Sample. The ratio adjustment for components from the P-sample is the ratio of the P-sample population total from the A.C.E. to the P-sample population total based on the Evaluation Sample. The post-stratum index h is suppressed in the following definitions.

$$B_{Minout} = m_{io} \cdot \frac{N_P}{N_{PF}}, \qquad B_{Pinout} = n_{io} \cdot \frac{N_P}{N_{PF}}$$

Balancing Error

Source. Balancing error must be addressed in the design of the search areas used to search for E-sample correct enumerations and P-sample matches. Limiting the search for correct enumerations and matches is necessary because the matching operation cannot search the entire census. By limiting the search area, a small percentage of correct enumerations will not be found and a small percentage of matches will not be found. This causes an underestimate of the correct enumerations and an underestimate of the matches. However, the estimate of the net error is not biased as long as the percentage error in the correct enumerations equals the percentage error in the matches. The A.C.E. design avoids balancing error by choosing the same block clusters for the E-sample and the P-sample and drawing the search areas consistently.
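The cancellation behind balancing can be checked numerically. The sketch assumes the dual system estimate has the form (C - II) x (CE / N_E) x (N_P / M), which matches the structure of the adjustment factor used in this appendix. The totals and shortfall percentages are invented.

```python
def dse(c, ii, ce, n_e, n_p, m):
    """Dual system estimate for one post-stratum (assumed form)."""
    return (c - ii) * (ce / n_e) * (n_p / m)


# Hypothetical totals with an unrestricted search.
base = dse(c=1000, ii=20, ce=950, n_e=1000, n_p=900, m=810)

# Limited search area that misses 2 percent of correct enumerations
# and 2 percent of matches: the shortfalls cancel in CE / M.
balanced = dse(c=1000, ii=20, ce=950 * 0.98, n_e=1000, n_p=900, m=810 * 0.98)

# Unequal shortfalls (2 percent versus 5 percent) do not cancel.
unbalanced = dse(c=1000, ii=20, ce=950 * 0.98, n_e=1000, n_p=900, m=810 * 0.95)
```

Equal proportional losses leave the estimate unchanged, while missing relatively more matches than correct enumerations inflates it.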
Definition. There is not a separate measurement of balancing error. Any balancing error that may arise during the implementation of the A.C.E. will be included in the measurement of data collection error.

Section I. References

Adams, T. and Liu, X. (2001). ESCAP II: Evaluation of Lack of Balance and Geographic Errors Affecting Person Estimates, Executive Steering Committee for A.C.E. Policy II, Report 2.
Baker v. Carr, 369 U.S. 186 (1962).
Belin, T., Diffendal, G., Mack, S., Rubin, D., Schafer, J., and Zaslavsky, A. (1993). Hierarchical Logistic Regression Models for Imputation of Unresolved Enumeration Status in Undercount Estimation, Journal of the American Statistical Association, 88, 1149-1166.
Bell, W. (2001). Accuracy and Coverage Evaluation: Correlation Bias, DSSD Census 2000 Procedures and Operations Memorandum Series #B-12*.
Bell, W. (2001b). ESCAP II: Estimation of Correlation Bias in 2000 A.C.E. Estimates Using Revised Demographic Analysis Results, Executive Steering Committee for A.C.E.
Policy II, Report 10.
Byrne, R., Imel, L., Ramos, M., and Stallone, P. (2001). Accuracy and Coverage Evaluation: Person Interviewing Results, Census 2000 Procedures and Operations Memorandum Series #B-5*.
Cantwell, P. (1999). Accuracy and Coverage Evaluation Survey: Overview of Missing Data for P & E Samples, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-3.
Cantwell, P. (2000). Accuracy and Coverage Evaluation Survey: Specifications for the Missing Data Procedures, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-25.
Cantwell, P., McGrath, D., Nguyen, N., and Zelenak, M. (2001). Accuracy and Coverage Evaluation: Missing Data Results, DSSD Census 2000 Procedures and Operations Memorandum Series #B-7*.
Childers, D. (2001). Accuracy and Coverage Evaluation: The Design Document, Census 2000 Procedures and Operations Memorandum Series, Chapter S-DT-1, Revised.
Childers, D., Byrne, R., Adams, T., and Feldpausch, R. (2001). Accuracy and Coverage Evaluation: Person Matching and Followup Results, Census 2000 Procedures and Operations Memorandum Series #B-6*.
Coale, A. (1955). The Population of the United States in 1950 Classified by Age, Sex, and Color - A Revision of Census Figures, Journal of the American Statistical Association, 50, 16-54.
Coale, A. and Rives, N. (1973). A Statistical Reconstruction of the Black Population of the United States, 1880-1970: Estimates of True Numbers by Age and Sex, Birth Rates, and Total Fertility, Population Index, 39(1), 3-36.
Coale, A. and Zelnik, M. (1963). New Estimates of Fertility and Population in the United States, Princeton University Press.
Cox, L. and Ernst, L. (1982). Controlled Rounding, INFOR, Vol. 20, No. 4.
Davis, M. (1991). Preliminary Final Report for PES Evaluation Project P7: Estimates of P-sample Clerical Matching Error from a Rematching Evaluation, 1990 Coverage Studies and Evaluation Memorandum Series #H-2.
Davis, M. (1991b). Preliminary Final Report for PES Evaluation Project P10: Measurement of the Census Erroneous
Enumerations - Clerical Error Made in the Assignment of Enumeration Status, 1990 Coverage Studies and Evaluation Memorandum Series #L-2.
Davis, P. (2001). Accuracy and Coverage Evaluation: Dual System Estimation Results, DSSD Census 2000 Procedures and Operations Memorandum Series #B-9*.
Fay, R., Passel, J., Robinson, J.G., and Cowan, C. (1988). The Coverage of Population in the 1980 Census, 1980 Census of Population and Housing: Evaluation and Research Reports, PHC80-E4, U.S. Bureau of the Census, Washington, D.C.
Ghosh, M. and Rao, J.N.K. (1994). Small Area Estimation: An Appraisal, Statistical Science, Vol. 9, No. 1, 55-93.
Gonzalez, M. (1973). Use and Evaluation of Synthetic Estimators, Proceedings of the Social Statistics Section, American Statistical Association.
Gonzalez, M. and Waksberg, J. (1973). Estimation of the Error of Synthetic Estimates, paper presented at the first meeting of the International Association of Survey Statisticians, Vienna, Austria, August 18-25, 1973.
Griffin, R. (1999). Accuracy and Coverage Evaluation Survey: Post-stratification Research Methodology, DSSD Census 2000 Procedures and Operations Memorandum Series
#Q-5.
Griffin, R. and Haines, D. (2000). Accuracy and Coverage Evaluation Survey: Poststratification for Dual System Estimation, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-21.
Griffin, R. and Haines, D. (2000b). Accuracy and Coverage Evaluation Survey: Final Poststratification Plan for Dual System Estimation, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-24.
Griffin, R. and Malec, D. (2001). Accuracy and Coverage Evaluation: Assessment of Synthetic Assumption, DSSD Census 2000 Procedures and Operations Memorandum Series #B-14*.
Haines, D. (2001). Accuracy and Coverage Evaluation Survey: Synthetic Estimation, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-46.
Haines, D. (2001b). Accuracy and Coverage Evaluation Survey: Computer Specifications for Person Dual System Estimation (U.S.) - Re-issue of Q-37, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-48.
Himes, C. and Clogg, C. (1992). An Overview of Demographic Analysis as a Method for Evaluating Census Coverage in the United States, Population Index, 58(4), 587-607.
Hogan, H. (1992). The 1990 Post-Enumeration Survey: An Overview, The American Statistician, Vol. 46(4), 261-269.
Hogan, H. (1993). The 1990 Post-Enumeration Survey: Operations and Results, Journal of the American Statistical Association, 88, 1047-1060.
Hogan, H. (2000). The Accuracy and Coverage Evaluation: Theory and Application, Proceedings of the Survey Research Methods Section, American Statistical Association.
Hogan, H. (2001). Accuracy and Coverage Evaluation Survey: Effect of Excluding Late Census Adds, DSSD Census 2000 Procedures and Operations Memorandum Series
#Q-43.
Ikeda, M. (1997). Effect of Using the 1996 ICM Characteristic Imputation and Probability Modeling Methodology on the 1995 ICM P and E-Sample Data, DSSD Census 2000 Dress Rehearsal Memorandum Series #A-20.
Ikeda, M. (1998). Effect of Different Methods for Calculating Match and Residence Probabilities for the 1995 P-Sample Data, DSSD Census 2000 Dress Rehearsal Memorandum Series #A-23.
Ikeda, M. (1998b). Effect of Different Methods for Calculating Correct Enumeration Probabilities for the 1995 E-Sample Data, DSSD Census 2000 Dress Rehearsal Memorandum Series #A-28.
Ikeda, M. (1998c). Effect of Using Simple Ratio Methods to Calculate P-Sample Residence Probabilities and E-Sample Correct Enumeration Probabilities for the 1995 Data, DSSD Census 2000 Dress Rehearsal Memorandum Series #A-30.
Ikeda, M., Kearney, A., and Petroni, R. (1998). Missing Data Procedures in the Census 2000 Dress Rehearsal Integrated Coverage Measurement Sample, Proceedings of the Survey Research Methods Section, American Statistical Association.
Ikeda, M. and McGrath, D. (2001). Accuracy and Coverage Evaluation Survey: Specifications for the Missing Data Procedures; Revision of Q-25, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-62.
Kearney, A. and Ikeda, M. (1999). Handling of Missing Data in the Census 2000 Dress Rehearsal Integrated Coverage Measurement Sample, Proceedings of the Survey Research Methods Section, American Statistical Association.
Keathley, D., Kearney, A., and Bell, W. (2001). ESCAP II: Analysis of Missing Data Alternatives for the Accuracy and Coverage Evaluation, Executive Steering Committee for A.C.E. Policy II, Report 12.
Keeley, C. (2000). Census 2000 Accuracy and Coverage Evaluation Computer Assisted Interview, DSSD Census 2000 Procedures and Operations Memorandum Series
#S-QD-02.
Killion, R.A. (1998). Estimation Decisions for the Integrated Coverage Measurement Survey for Census 2000, Census 2000 Decision Memorandum No. 42.
Kostanich, D., Griffin, R., and Fenstermaker, D. (1999). Census 2000 Accuracy and Coverage Evaluation Survey: Sample Allocation and Post-stratification Plans, DSSD Census 2000 Procedures and Operations Memorandum Series
#R-2.
Marks, E. (1979). The Role of Dual System Estimation in Census Evaluation, in K. Krotki (Ed.), Developments in Dual System Estimation of Population Size and Growth, Edmonton: University of Alberta Press, 156-188.
Nash, F. (2000). Overview of the Duplicate Housing Unit Operations, Census 2000 Information Memorandum Number 78.
National Center for Health Statistics (1968). Synthetic State Estimates of Disability, P.H.S. Publication 1759, U.S. Government Printing Office, Washington, D.C.
Raglin, D. and Bean, S. (1999). Outmover Tracing and Interviewing, Census 2000 Dress Rehearsal Evaluation Results Memorandum Series #C-3.
Rao, J.N.K. and Shao, J. (1992). Jackknife Variance Estimation with Survey Data Under Hot Deck Imputation, Biometrika, 79, 811-822.
Reynolds v. Sims, 377 U.S. 533 (1964).
Robinson, J.G. (2001). ESCAP II: Demographic Analysis Results, Executive Steering Committee for A.C.E. Policy II, Report 1.
Robinson, J.G., Ahmed, B., Gupta, P., and Woodrow, K. (1993). Estimation of Population Coverage in the 1990 United States Census Based on Demographic Analysis, Journal of the American Statistical Association, 88, 1061-1071.
Schindler, E. (1998). Allocation of the ICM Sample to the States for Census 2000, Proceedings of the Survey Research Methods Section, American Statistical Association.
Schindler, E. (1999). Comparison of DSE C and DSE A, Census 2000 Dress Rehearsal Evaluation Memorandum #C-8a.
Schindler, E. (2000). Accuracy and Coverage Evaluation Survey: Post-stratification Preliminary Research Results, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-23.
Sekar, C.C. and Deming, W.E. (1949). On a Method of Estimating Birth and Death Rates and the Extent of Registration, Journal of the American Statistical Association, 44, 101-115.
Siegel, J. (1974). Estimates of Coverage of Population by Sex, Race, and Age: Demographic Analysis, 1970 Census of Population and Housing: Evaluation and Research Program, PHC(E)-4, U.S. Bureau of the Census, Washington, D.C.
Siegel, J. and Zelnik, M. (1966). An Evaluation of Coverage in the 1960 Census of Population by Techniques of
Demographic Analysis and by Composite Methods, Proceedings of the Social Statistics Section, American Statistical Association.
Starsinic, M. (2001). Accuracy and Coverage Evaluation Survey: Specifications for Covariance Matrix Output Files from Variance Estimation for Census 2000, DSSD Census 2000 Procedures and Operations Memorandum Series
#V-4.
Starsinic, M. and Kim, J. (2001). Accuracy and Coverage Evaluation: Computer Specifications for Variance Estimation for Census 2000 - Revision, DSSD Census 2000 Procedures and Operations Memorandum Series #V-5.
U.S. Bureau of the Census (1985). Evaluating Censuses of Population and Housing, Statistical Training Document, ISP-TR-5, Washington, D.C.
West, K. (1991). Final Report for PES Evaluation Project P4: Quality of Reported Census Day Address - Evaluation Follow-up, 1990 Coverage Studies and Evaluation Memorandum Series #D-2.
Winkler, W. (1994). Advanced Methods for Record Linkage, Proceedings of the Survey Research Methods Section, American Statistical Association.
Wolfgang, G. (1999). Request for Dress Rehearsal Surrounding Block Files for A.C.E. Research, unpublished Census Bureau Memorandum.
Wolter, K. (1986). Some Coverage Error Models for Census Data, Journal of the American Statistical Association, 81, 338-346.
Woltman, H., Alberti, N., and Moriarity, C. (1988). Sample Design for the 1990 Census Post Enumeration Survey, Proceedings of the Survey Research Methods Section, American Statistical Association.
ZuWallack, R. (2000). Sample Design for the Census 2000 Accuracy and Coverage Evaluation, Proceedings of the Survey Research Methods Section, American Statistical
Association.

Accuracy and Coverage Evaluation of Census 2000: Design and Methodology
Section II: A.C.E. Revision II
March 2003
U.S. Census Bureau, Census 2000

Chapter 1. Introduction to A.C.E. Revision II

INTRODUCTION

The Accuracy and Coverage Evaluation (A.C.E.) survey was designed to measure and possibly correct net coverage error in Census 2000. However, because A.C.E. failed to measure a significant number of erroneous enumerations, A.C.E. did not meet these objectives. The Census Bureau's Executive Steering Committee for A.C.E. Policy (ESCAP) recommended twice NOT to correct the census counts.[1]

There are, however, concerns about differential coverage error in Census 2000 data. While the Census 2000 data products will not be corrected, it is possible that improvements could be made to the post-censal population estimates used for survey controls. This is the Census Bureau's motivation for correcting errors in the A.C.E. data and developing improved estimates of the net undercount.
The improved estimates are called A.C.E. Revision II estimates. These revised estimates provide a better picture of Census 2000 coverage and will help us design a better coverage measurement program for 2010. This part of the document provides a description of the methodology used to produce the A.C.E. Revision II estimates. A comprehensive technical description of the methodology used to produce the original estimates of net undercount released in March 2001 is presented in the first half of this publication.

This chapter summarizes the history of the two adjustment decisions and discusses key findings and limitations. It also introduces the key components of the revision and describes the major errors being corrected. The next chapter provides an overview of the revision process and subsequent chapters provide detailed methodology as follows:

Chapter 2: Summary of A.C.E. Revision II Methodology
Chapter 3: Correcting Data for Measurement Error
Chapter 4: A.C.E. Revision II Missing Data Methods
Chapter 5: Further Study of Person Duplication in Census 2000
Chapter 6: A.C.E. Revision II Estimation
Chapter 7: Assessing the Estimates

BACKGROUND

The original A.C.E. estimates were available in February of 2001, in time to allow for the possibility of correcting Census 2000 redistricting files. The Census Bureau's ESCAP recommended in March 2001 not to correct the Census 2000 counts for purposes of redistricting (ESCAP I, 2001). The Secretary of Commerce concurred. Given the information available at that time, this decision was not based on any clear evidence that the census counts were more accurate, but rather on concern that there was some yet undiscovered error in the A.C.E. The A.C.E. estimate of a 3.3 million net undercount was much larger than the Demographic Analysis (DA) estimate of only 340,000.
Further evaluations were conducted over the next 6 months to examine the reasons for the discrepancy and to determine if other Census 2000 data products should be corrected. The Census 2000 redistricting files were the first of many Census 2000 data products scheduled for public release. See the Census Bureau's Web site, www.census.gov, for released data products. The question remained as to whether these other Census 2000 data releases should be corrected. In October 2001, the ESCAP again decided not to correct the census counts for other Census 2000 data products.
Analysis of A.C.E. evaluation data and a study of duplicates in the census revealed that the A.C.E. failed to measure large numbers of erroneous census enumerations (ESCAP II, 2001). This error called into question the quality of the A.C.E. survey results. Some of the key findings from the analyses are:

* An evaluation by Krejsa and Raglin (2001) was the first indication that A.C.E. seriously underestimated erroneous enumerations. This analysis revealed an additional net 1.9 million erroneously enumerated persons for those cases that could be resolved. These results are based on an independent reinterview and matching of about 70,000 E-sample persons. Because of the serious implications of this finding, a further Review Study was conducted.
* The findings from the Review Study by Adams and Krejsa (2001) showed that A.C.E. underestimated erroneous enumerations by a net of 1.45 million persons, which was smaller than the evaluation figure but still a significantly large amount. This figure does not include unresolved cases, so the estimated amount is probably somewhat higher. This study was based on a sample of about 17,000 persons selected from the 70,000 person evaluation follow-up E sample. The most experienced analysts reviewed these cases using both the original A.C.E. person follow-up interviews as well as the reinterview results to determine their enumeration status.
* Mule (2001) showed that Census 2000 suffers from a large number of duplicate enumerations, i.e., persons who were double counted. Mule computer-matched census enumerations in A.C.E. block clusters to those across the entire country. The matching used by Mule was conservative in picking up census duplicates given his requirement for exact matching at the first stage. Within the A.C.E. block clusters, Mule found only 38 percent of the in-scope duplicates that A.C.E. found, leading us to believe that his matching algorithm was underestimating duplicates in the census. Note that A.C.E. was not designed to estimate duplicates outside the search area, and this, itself, was not a design flaw. A.C.E. was, however, expected to determine which census enumerations were erroneous because they were reported at the wrong residence. The design of A.C.E. Revision II accounts for duplicates outside the search area. Mule's study did not distinguish which of the duplicate pair was correct and which was erroneous, but one could easily speculate that half of these should be correct and half should be erroneous.
* Feldpausch (2001) examined the A.C.E. enumeration status for E-sample cases identified by Mule (2001) as duplicates outside the search area. Only 14 percent of the E-sample persons that were duplicates of a person in a housing unit were coded as erroneous by A.C.E. This was much lower than the expected 50 percent, indicating that A.C.E. underestimated erroneous enumerations due to not perceiving that these E-sample persons should have been counted at other residences. Note that these results suggest measurement error in the original A.C.E. figures released.
* Fay (2001, 2002) then compared the enumeration status for the E-sample Review Study cases to the duplicates identified by Mule (2001) outside the search area. Only 19 percent of the review cases that were duplicates of a person in a housing unit were coded as erroneous by the Review Study. Again, this was much less than the expected 50 percent, indicating that the evaluation data and the special review did not identify all the erroneous enumerations. Using these data, Fay then produced a lower bound on the level of unmeasured erroneous enumerations of 2.9 million.
* There was also evidence that similar problems may have affected the population sample (P sample) which is used to measure the omission rate. A.C.E. evaluation data from Raglin and Krejsa (2001) show that there are measurement errors in determining residency and mover status.

[1] The ESCAP recommendations, supporting analyses, technical assessments, and limitations can be found on the Census Bureau's Web site at www.census.gov/dmd/www/EscapRep.html.

Using Fay's lower bound on the level of unmeasured erroneous enumerations, Thompson et al. (2001) produced a Revised Early Approximation of undercount for three race/Hispanic origin groups. These estimates were intended to be illustrative of net undercount and possible coverage differences. The same methodology and data were later used to expand the calculations to seven race/Hispanic origin groups. See Fay (2002) and Mule (2002) for details. These preliminary estimates show a very small net undercount. The data also indicate that the differential undercount has not been eliminated. These results are limited to the extent that they only provide information at the national level for broad population groups. Furthermore, these preliminary approximations were based on a small subset of A.C.E. data and only partially correct for errors in measuring erroneous enumerations. Potential errors in measuring omissions were not accounted for.

In summary, the A.C.E. results were not acceptable because A.C.E. failed to measure large numbers of erroneous census enumerations. This was the reason for not using the A.C.E., but this does not mean that there were no other errors in the A.C.E. In particular, there was concern about P-sample cases that matched to enumerations suspected of being duplicates. If the E-sample case was erroneous, then that match cannot be valid. The extent of this problem was not quantified at the time of the ESCAP II decision. The level of other errors was small by comparison, and therefore, was not a major factor in this decision.
See Hogan et al. (2002) and Mulry and Petroni (2002) for further information.

Plans for Revising the 2000 A.C.E. Estimates

Even though the ESCAP recommended twice not to correct the census counts, they had concerns about differential coverage error in Census 2000 data. They thought it possible that further research resulting in revised estimates of coverage could potentially be used to improve the post-censal estimates. In addition, revised estimates would provide a better understanding of Census 2000 coverage error that could be used to improve census operations for 2010 and would help in developing better methodologies for the 2010 coverage measurement program.

The major objective was to produce improved estimates of the household population that could be used to measure net coverage error in Census 2000. This meant obtaining better estimates of erroneous census enumeration from the E sample and obtaining better estimates of census omissions from the P sample. Furthermore, since the national net undercount, as indicated by both DA and the Revised Early Approximations, was very close to zero and the census included large numbers of erroneous enumerations in the form of duplicates, it was imperative that the revised methodology carefully account for both overcounts and undercounts. Hogan (2002) summarized the major revision issues in the form of the following five challenges:

1. Improve estimates of erroneous census enumerations
2. Improve estimates of census omissions
3. Develop new models for missing data
4. Enhance the estimation post-stratification
5. Consider adjustment for correlation bias

There were no field operations associated with the A.C.E. Revision II process. Because of the late date, it was not feasible (or practical) to revisit households for additional data collection. Consequently, the revisions were based on data that had already been collected. One aspect of the strategy for revising the coverage estimates involves correcting measurement error using information from the A.C.E. evaluation data. This is referred to as the recoding operation. Another aspect of these corrections involves conducting a more extensive person duplicate study to correct for measurement error that was not detected by A.C.E. evaluations. This is referred to as the Further Study of Person Duplication (FSPD). The estimation method, discussed briefly in Chapter 2 and more fully in Chapter 6, is designed to handle overlap of errors detected by both of these studies to avoid overcorrecting for measurement error.

The recoding operation was designed to improve estimates of erroneous census enumerations and census omissions. It uses the original A.C.E. person interview (PI) and person follow-up (PFU), the evaluation follow-up interview (EFU), the matching error study (MES), and the PFU/EFU Review Study[2] to correct for measurement error in enumeration status, residence status, mover status and matching status. This effort involved extensive recoding of about 60,000 P-sample cases and more than 70,000 E-sample cases.[3] An automated computer algorithm was used to recode most of the cases, but many required a clerical review by experienced analysts at the National Processing Center. The analysts had access to the questionnaire responses, as well as interviewer notes that put them in a better position to resolve apparent discrepancies. It was not possible to completely code all cases because of missing or conflicting information; however, the number of conflicting cases was relatively small.

The duplicate study was designed to further improve estimates of erroneous census enumerations and census omissions. This study used computer matching and modeling techniques to identify E- and P-sample cases that link to census enumerations across the entire country, including group quarters, reinstated, and deleted census cases. For the E-sample links, this study does not identify which enumeration is correct and which is the duplicate. For P-sample links, this study does not identify whether the correct Census Day residence is at the P-sample location or the census location. This information is used to model the probability that an E-sample linked case is a correct enumeration or that a P-sample case is a resident on Census Day.

New missing data models were developed to reflect the different types of missing data now possible as a result of the recoding operation. There were three new types of missing data to deal with:

1. P-sample households that were originally considered interviews, but the recoding determined that there were no valid Census Day residents,
2. cases with unresolved match, enumeration, or residency status because of incomplete or ambiguous interview data, and
3. cases with conflicting enumeration or residency status, because contradictory information was collected in the A.C.E. PFU and EFU interviews. It was impossible to determine which data were valid for these cases.

A household noninterview weighting adjustment using new cell definitions was used for type 1 above. Imputation cells and donor pools were developed for the second type of missing data based on detailed
responses to the questionnaire. For the conflicting cases in type 3 above, there were no applicable donor pools, and probabilities of 0.5 were imputed for correct enumeration status and Census Day residency status. Fortunately, the recoding operation resulted in a relatively small number of these cases.

The revision effort incorporates separate post-strata for estimating census omissions and erroneous census enumerations because the factors related to each of these are likely to be different. The research effort focused on determining variables related to erroneous enumerations. This was because much of the previous work on developing post-strata focused only on the census omissions, and by default, the same post-strata were applied to the erroneous inclusions. For the E sample, some of the original post-stratification variables have been eliminated and additional variables have been included. Variables such as region, Metropolitan Statistical Area/type of census enumeration area, and tract-level return rate were replaced by proxy status, type and date of census return, and household relationship and size. For the P sample, only the age variable was modified to define separate post-strata for children aged 0 to 9 and those 10 to 17. This was done because the DA estimates suggested different coverage for these groups. The estimated correct enumeration rates and estimated match rates are used to calculate Dual System Estimates (DSEs) for the cross-classification of the E and P post-strata.

[2] The PFU/EFU Review Study was not a planned evaluation. It was a special study conducted in a subsample of the evaluation data to resolve discrepancies between enumeration status in the PFU and EFU.
[3] These are probability subsamples of the original A.C.E. P and E samples and in the context of A.C.E. Revision II are called revision samples, but they are in fact equivalent to the evaluation follow-up samples.

The A.C.E. Revision II DSEs include an adjustment for correlation bias. Correlation bias exists whenever the probability that an individual is included in the census is dependent on the probability that the individual is included in the A.C.E. This form of bias generally has a downward effect on estimates, because people missed in the census may be more likely to also be missed in the A.C.E. Since the intent of the A.C.E. Revision II is to estimate the net coverage error, it is important to carefully account for errors of omissions and errors of erroneous inclusions. In previous coverage measurement surveys, the erroneous inclusions were assumed to be much smaller than omissions. Consequently, not adjusting for correlation bias had the effect of understating the net undercount, and relative to the census the result was a correction that was in the right direction, but just not big enough. In the presence of large numbers of overcounts, this assumption is no longer valid and it is possible that a correction might not even be in the right direction when the estimate is close to zero. For example, if there is a small true net undercount, it is possible to estimate an overcount because the DSE would underestimate population in the presence of correlation bias. Estimates of correlation bias were calculated using the two-group model and sex ratios from DA. The sex ratio is defined as the number of males divided by the number of females. This model assumes no correlation bias for females or for males less than 18 years of age, and that Black males have a relative correlation bias that is different from the relative correlation bias for non-Black males. The correlation bias adjustment is also done by three age categories: 18-29, 30-49, and 50 and over, with the exception of non-Black males 18 to 29 years of age. This is because the A.C.E. Revision II sex ratio for non-Blacks 18-29 exceeds the corresponding modified DA sex ratio, which is likely a result of a data problem. This model further assumes that relative correlation bias is constant over male post-strata within age groups.

The DSEs, adjusted for correlation bias, are used to produce coverage correction factors for each of the cross-classified post-strata. These factors are applied or carried down within the post-strata to produce estimates for geographic areas such as counties or places. This process is referred to as synthetic estimation. The key assumption underlying this methodology is that the net census coverage, estimated by the coverage correction factor, is relatively uniform within the post-strata. Failure of this assumption leads to synthetic error.

Chapter 2. Summary of A.C.E. Revision II Methodology

INTRODUCTION

The original A.C.E. estimates were found to be unacceptable because they failed to detect significant numbers of erroneous census enumerations. There were also suspicions that the A.C.E. may have included residents in its P sample that were actually nonresidents. Thus, the major goal in revising the A.C.E. estimates included a correction of these measurement errors. One aspect of these corrections involved correcting a subsample of the A.C.E. data.
Another aspect involved correcting measurement errors that could not be detected with the information available in the subsample. These additional errors were identified via a duplicate study. The purpose of this chapter is to present a high-level overview of the process used to produce A.C.E. Revision II estimates of the population coverage of Census 2000. Further details concerning the methodology and procedures are included in subsequent chapters.

Background

The chronology of events leading to the corrected A.C.E. Revision II results was as follows:

1. The A.C.E. estimates produced in March 2001 were based on the Full E and P samples, which were probability samples of over 700,000 persons in 11,303 block clusters.
2. The Matching Error Study (MES) and the Evaluation Follow-up (EFU) were two programs that evaluated the March 2001 A.C.E. estimates. The MES measured errors introduced when the census and A.C.E. interviews were matched. The EFU, which was designed to study unusual living situations, entailed another interview. It evaluated the Census Day residency, enumeration status, and mover status assigned during the A.C.E. interview and A.C.E. Person Follow-up (PFU) interview. The MES and EFU were conducted in a subsample of 2,259 block clusters selected from the original 11,303 block clusters. A further subsample of persons within these block clusters was selected for the EFU evaluation.
3. The PFU/EFU Review occurred next; it was not part of the planned evaluations. It was done in order to resolve major discrepancies in enumeration status between the EFU and PFU results. Thus, the Review E sample was a subsample of the EFU E sample.
4. At this point the A.C.E. Revision II program commenced. The Revision E and P samples were developed for purposes of producing A.C.E. Revision II estimates. They each comprise about 70,000 sample persons. These samples were essentially the same as the evaluation E and P samples for EFU, but the data have undergone a major recoding to correct for measurement error. These data, along with other measurement error corrections identified by the duplicate study, were used to adjust the Full E and P samples to produce A.C.E. Revision II estimates.

The A.C.E. Revision II process is presented below. First, the corrections for measurement error (undetected erroneous enumerations and P-sample nonresidents) in the Revision Samples are explained. Then, a discussion is given of the missing data methods applied to cases whose match, residency, or enumeration status had changed in the Revision Samples. Next, the process for identifying census duplicates across the entire nation is discussed. An applicable dual system estimation formula that incorporates these changes and accounts for correlation bias is presented. Finally, synthetic estimation was employed to produce A.C.E. Revision II results. See Kostanich (2003) for a summary of the methodology.

CORRECTING MEASUREMENT ERROR IN THE REVISION SAMPLES

As previously stated, the original A.C.E. process (step 1 above) failed to detect significant numbers of erroneous census enumerations (EEs). These undetected EEs (one part of measurement error in the A.C.E.) were uncovered during the evaluations of the A.C.E. (step 2 above). In general, the original A.C.E. Person Interview (PI) and PFU, the EFU interview, the MES, and the PFU/EFU Review results were used to correct for measurement error in the enumeration, residency, mover, and match statuses for subsamples of the Full A.C.E., called the Revision E and P samples. No additional data were collected in this measurement error correction process.

The Revision Samples underwent extensive recoding using all available data indicated above. This recoding included the original interview and matching results, the evaluation interview and matching results, as well as the recoding done for the PFU/EFU Review.

The A.C.E. Revision II recoding operation was an extension of the PFU/EFU Review clerical recoding, which was used to examine discrepancies between enumeration status in the original A.C.E. and the Evaluation Follow-up (EFU). Given the information available, the recoding that was done on the 17,500 cases in the Review E sample was considered to have negligible error, since these data were reviewed and recoded by expert matchers using rules consistent with census residence rules.

An automated coding algorithm based on specific responses to the PFU and the EFU questionnaires was used to determine an appropriate code for each case. This was done for both the PFU interview and the EFU interview. The automated coding also assigned a Why code that described the reason why the particular code was assigned. A three-step process was followed to assign final codes to each case:

1. Validation. Determine, for categories of Why codes, if the automated coding was of high quality based on level of agreement with the Review data.
2. Targeting. Target only those Why code categories whose automated codes had low levels of agreement with the Review data.
3. Clerical coding. Clerically recode only cases in the targeted Why code categories. The clerical recoding took advantage of handwritten interviewer comments.

In general, cases did not go to clerical review if both the PFU and EFU automated codes agreed, the mover statuses also agreed, and the Why code category was deemed to be of high enough quality.

After the A.C.E. Revision II recoding operation corrected for enumeration, residency, and mover status, the results of the MES were used to correct for false matches and false nonmatches. Some matching errors were a result of incorrect residency status coding and had been corrected as part of the recoding operation discussed above. To determine the correct match status, each of the possible combinations of match status was reviewed to determine the appropriate match status for each type of case. In general, the MES match status was assigned when there were changes from a match to a nonmatch or changes from a nonmatch to a match. For other situations the match status from the EFU coding was assigned. See Krejsa and Adams (2002) for further details.

ADJUSTMENT FOR MISSING DATA

As with all survey data, it is not possible to obtain interviews for all sample cases, nor is it possible to obtain answers to all interview questions. For the Full A.C.E. E and P samples, household noninterview adjustments were used to adjust for noninterviewed households. In addition, imputation methods were used to adjust for missing characteristics such as age or tenure, as well as enumeration, residency, and match status. For the A.C.E. Revision II work, these missing data adjustments for the Full A.C.E. E and P samples were essentially unchanged from those used to produce the March 2001 A.C.E. estimates. For the Revision E and P samples, however, there were three new types of missing data to deal with:

1. Noninterviewed households: Revision P-sample households that were considered interviews in the A.C.E. P sample, but were identified as noninterviews in the Revision coding because it was determined that there were no valid Census Day residents;
2. Revision E- or P-sample cases with unresolved match, enumeration, or residency status because of incomplete or ambiguous interview data;
3. Revision E- or P-sample cases with conflicting enumeration or residency status. This occurred when contradictory information was collected in the A.C.E. PFU and the EFU interviews and it could not be determined which was valid.

Household Noninterview Adjustment for the Revision P Sample

For the original March 2001 A.C.E. estimates, the household noninterview adjustment generally spread the weights of the Full P-sample noninterviewed housing units over interviewed housing units in the same block cluster with the same housing unit structure type. The methodology for the Revision P-sample household noninterview adjustment for Interview Day was essentially unchanged from that used for the Full P sample. There was, however, an important change for the noninterview adjustment for Census Day residency. A separate cell was defined for new noninterviews due to whole households of persons determined to be inmovers or nonresident outmovers based on the recoding that was done to correct for measurement error.

Imputation for Revision E- or P-Sample Unresolved Cases

In the Full A.C.E. P sample, persons with unresolved Census Day residency or match status came about in two ways. First, the person interview (PI) may not have provided sufficient information for matching and follow-up. Second, the Person Follow-up (PFU) may not have collected adequate information to determine a person's Census Day residency status or their match status. The imputation method differed by how the case came to be unresolved.

Revision P-sample persons with insufficient information for matching and follow-up tended also to have had insufficient information in the original coding of the Full P sample, except for some rare coding changes. These persons with insufficient information were not sent out for an Evaluation Follow-up interview.

For the Revision P sample, the imputation of Census Day residency was improved upon by defining finer imputation cells that included whether or not the housing unit was matched, not matched, or had a conflicting household. The probability of a match was imputed based on the overall match rate for five groups defined by mover status, housing unit match status as in the original A.C.E., and also on conflicting household status.

For Revision P- and E-sample persons who were unresolved because of ambiguous or incomplete follow-up information, the situation was more complicated because there were two follow-up interviews to consider, the PFU and EFU. For the Full E and P samples, imputation cells were based mostly on information obtained before any follow-up was conducted. For the Revision E and P samples, imputation cells relied on the after follow-up information. This change was the single most important improvement in the missing data methodology.

Imputation for Revision E- or P-Sample Conflicting Cases

When the A.C.E. PFU and EFU interviews had contradictory information, the case was assigned a code of conflicting. All cases determined to be conflicting based on the automated recoding were sent to analysts for further clerical review. By examining the handwritten notes of interviewers, the analysts could often determine which of the interviews was better and assign an appropriate code. There were some cases where the interviews appeared to be of equal quality, such as when both respondents were household members or both respondents were proxies of equal caliber. For these conflicting cases, the interviews seemed equally valid based on the expertise of the analysts.
Therefore, probabilities of 0.5 were imputed for correct enumeration for Revision E-sample conflicting cases and for Census Day residency for Revision P-sample conflicting cases.

FURTHER STUDY OF PERSON DUPLICATION

Earlier work showed that correcting measurement error by recoding was not going to correct all the missed erroneous enumerations. Evaluations of the March 2001 A.C.E. coverage estimates indicated the A.C.E. failed to detect a large number of erroneous census enumerations. One type of erroneous census enumeration was the duplicate census enumeration; that is, a census enumeration included in the census two or more times. The A.C.E. was not specifically designed to detect duplicate census enumerations beyond the A.C.E. search area (the area where census and A.C.E. person matching was conducted). However, there was an expectation that the A.C.E. would detect that these E-sample enumerations had another residence and that roughly half the time this other residence was the usual residence. Feldpausch (2001) showed this expectation was not met.

For purposes of constructing A.C.E. Revision II estimates, the study of person duplication used matching and modeling techniques to identify duplicate links between the Full E and P samples and census enumerations. Links to group quarters, reinstated, deleted, and E-sample eligible records throughout the entire nation were allowed. The matching algorithm used statistical matching to identify linked records. Statistical matching allowed for the matching variables not to be exact on both records being compared. Because linked records may not refer to the same individual even when the characteristics used to match the records were identical, modeling techniques were used to assign a measure of confidence, the duplicate probability, that the two records refer to the same individual.

Matching Algorithm

The matching algorithm consisted of two stages. The first stage was a national match of persons using statistical matching. Statistical matching links records based on similar characteristics or close agreement of characteristics. Statistical matching allowed two records to link in the presence of missing data and typographical or scanning errors. The second stage of matching was limited to matching persons within households that contained a link from the first stage. The first stage established a link between two housing units. The second stage was a statistical match of all household members in the sample housing unit to all household members in the census housing unit.

Modeling Techniques

The set of linked records consists of both duplicated enumerations and person records with common characteristics. Using two modeling approaches, the probability that the linked records were the same person was estimated. One approach used the results of the statistical matching and relied on the strength of multiple links within the household to indicate person duplication. The second relied on an exact match of the census to itself and the distribution of births, names, and population size to indicate if the individual link was a duplicate. These two approaches were combined to yield an estimated duplicate probability for the linked records from the statistical matching of the Full E and P samples to the census. See Chapter 5 for a full discussion of the person duplication study.

THE A.C.E. REVISION II DSE FORMULA

With the correction of measurement error in the Revision E and P samples, the adjustment for missing data in the Revision E and P samples, and the determination of census duplicate links between the Full E and P samples and census enumerations, the dual system estimation formula can be applied. The following sections explain the formula and its adjustment for the A.C.E. Revision II work.

Using procedure C for movers and different post-strata for the E and P samples, the DSE formula can be written as:
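\[
\mathrm{DSE}^{C}_{ij} = \left(\mathrm{Cen}'_{ij} - \mathrm{II}'_{ij}\right) \times
\frac{\left[\dfrac{CE_{i}}{E_{i}}\right]}
{\left[\dfrac{M_{nm,j} + \left[\dfrac{M_{om,j}}{P_{om,j}}\right] P_{im,j}}{P_{nm,j} + P_{im,j}}\right]}
\]

The A.C.E. Revision II DSE formula using procedure C for movers, separate E and P post-strata, measurement error corrections from the E and P Revision Samples, and duplicate study results is: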
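\[
\mathrm{Re}\,\mathrm{DSE}^{C}_{ij} = \left(\mathrm{Cen}'_{ij} - \mathrm{II}'_{ij}\right) \times
\frac{\left[\dfrac{CE^{ND}_{i}\, f_{1,i'} + \widetilde{CE}^{D}_{i}}{E_{i}}\right]}
{\left[\dfrac{M^{ND}_{nm,j}\, f_{2,j'} + \widetilde{M}^{D}_{nm,j} + \left[\dfrac{M_{om,j}\, f_{3,j'}}{P_{om,j}\, f_{4,j'}}\right]\left(P_{im,j}\, f_{5,j'} + g\left(P^{D}_{nm,j} - \widetilde{P}^{D}_{nm,j}\right)\right)}
{P^{ND}_{nm,j}\, f_{6,j'} + \widetilde{P}^{D}_{nm,j} + P_{im,j}\, f_{5,j'} + g\left(P^{D}_{nm,j} - \widetilde{P}^{D}_{nm,j}\right)}\right]}
\]

Recall that the II' term excludes the late census adds.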
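As a rough numerical illustration of the basic procedure C formula (the version without the Revision II adjustments), the following minimal Python sketch may help. The class, field, and function names are hypothetical, the counts are made up for illustration, and the sketch assumes the formula is read as the census count less II', multiplied by the correct enumeration rate and divided by the match rate.

```python
from dataclasses import dataclass


@dataclass
class EPostStratum:
    """Weighted E-sample tallies for one E post-stratum i (hypothetical container)."""
    correct_enumerations: float  # CE_i
    e_total: float               # E_i


@dataclass
class PPostStratum:
    """Weighted P-sample tallies for one P post-stratum j (hypothetical container)."""
    matches_nonmover: float  # M_nm,j
    matches_outmover: float  # M_om,j
    p_nonmover: float        # P_nm,j
    p_outmover: float        # P_om,j
    p_inmover: float         # P_im,j


def dse_procedure_c(cen_prime: float, ii_prime: float,
                    e: EPostStratum, p: PPostStratum) -> float:
    """Basic procedure C DSE for one cross-classified post-stratum ij."""
    correct_enumeration_rate = e.correct_enumerations / e.e_total
    # Procedure C: the outmover match rate is applied to the number of inmovers.
    outmover_match_rate = p.matches_outmover / p.p_outmover
    matches = p.matches_nonmover + outmover_match_rate * p.p_inmover
    p_total = p.p_nonmover + p.p_inmover
    match_rate = matches / p_total
    return (cen_prime - ii_prime) * correct_enumeration_rate / match_rate


if __name__ == "__main__":
    # Made-up illustrative counts, not figures from the report.
    e = EPostStratum(correct_enumerations=900.0, e_total=1000.0)
    p = PPostStratum(matches_nonmover=800.0, matches_outmover=40.0,
                     p_nonmover=900.0, p_outmover=50.0, p_inmover=100.0)
    print(f"{dse_procedure_c(1000.0, 50.0, e, p):.1f}")  # 971.6
```

In this made-up example the correct enumeration rate is 0.90 and the match rate is 0.88, so the estimate exceeds the 950 records left after removing II'. When the two rates are equal, the estimate reduces to that count.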
Notation

Terms
  CE    Correct enumerations
  E     E-sample total
  M     Matches
  P     P-sample total
  f     Adjusts for measurement error
  g     Adjusts nonmovers to movers due to duplication

Subscripts
  i, j          Full E and P post-strata
  i', j'        Revision E and P measurement error correction post-strata
  nm, om, im    nonmover, outmover, inmover

Superscripts
  C     DSE Procedure C for movers
  ND    Not a duplicate to census enumeration outside search area
  D     Duplicate to census enumeration outside search area
  ~     Includes probability adjustment for residency given duplication

Adjustment for Duplicates Using the Duplicate Study

The first task was to adjust the usual dual system estimate formula for those cases that have a link to a census enumeration outside the A.C.E. search area. P- and E-sample cases with links to census enumerations were assigned a nonzero probability of being a duplicate. P- and E-sample cases without duplicate links were assigned a probability of zero.

When estimating terms in the A.C.E. Revision II DSE involving nonduplicates, those indicated by a superscript ND, it was necessary to include the probability of not being a duplicate in the tallies. This probability of not being a duplicate was included in all of the terms involving the ND superscript.

Although the duplicate study identified E- and P-sample cases linking to census enumerations outside the A.C.E. search area, this study could not determine which component of the link was the correct one since there were no additional data collected to determine this. On the E-sample side, this study does not identify whether the linked E-sample case is the correct enumeration. On the P-sample side, this study does not identify whether the linked P-sample case is a resident on Census Day. Thus, it was necessary to estimate two conditional probabilities, which are reflected for the E sample in \(\widetilde{CE}^{D}_{i}\). In the P sample, these probabilities are reflected in the nonmover terms \(\widetilde{P}^{D}_{nm,j}\) and \(\widetilde{M}^{D}_{nm,j}\).

Adjustment for Measurement Error Using the Revision E and P Samples

Next, an adjustment is made for other measurement errors not accounted for by the duplicate study. This adjustment was applied only to nonduplicate terms to avoid overcorrection due to any overlap between the duplicate study and correction of measurement error.

In support of the A.C.E. Revision II program, the Revision Samples have undergone extensive recoding using all available interview data and matching results. Missing data adjustments have also been applied to the Revision Samples. These recoded data from the Revision Samples were used to correct for measurement error in the original Full E and P samples.

The ratio adjustments that correct for measurement error were based on the E or P Revision Sample and were a ratio of an estimate using the Revision coding to the estimate using the original coding. These adjustments were done by measurement error correction post-strata \(i'\) or \(j'\) and are denoted by the \(f\) terms in the A.C.E. Revision II DSE formula.

The term \(g\) adjusts the number of inmovers for those Full P-sample nonmovers who are determined to be nonresidents because of duplicate links. Some of these nonresidents are nonresidents because they are inmovers and should be added into the count of inmovers. The term \(P^{D}_{nm,j} - \widetilde{P}^{D}_{nm,j}\) is an estimate of nonresidents among nonmovers with duplicate links.

Adjustment for Correlation Bias Using Demographic Analysis

Next, the A.C.E. Revision II DSE estimates are adjusted to correct for correlation bias. Correlation bias exists whenever the probability that an individual is included in the census is not independent of the probability that the individual is included in the A.C.E. This form of bias generally has a downward effect on estimates, because people missed in the census may be more likely to also be missed in the A.C.E. Estimates of correlation bias are calculated using the two-group model and sex ratios from Demographic Analysis (DA). The sex ratio is defined as the number of males divided by the number of females. This model assumes no correlation bias for females or for males under 18 years of age, and that Black males have a relative correlation bias that is different from the relative correlation bias for non-Black males. The correlation bias adjustment is also done by three age categories: 18-29, 30-49, and 50 and over. This model further assumes that relative correlation bias is constant over male post-strata within age groups. The Race/Hispanic Origin Domain variable is used to categorize Black and non-Black.

The DA totals are adjusted to make them comparable with A.C.E. Race/Hispanic Origin Domains. Black Hispanics are subtracted from the DA total for Blacks and added to the DA total for non-Blacks. This is done because the A.C.E. assigns Black Hispanics to the Hispanic domain, not the Black domain. The second adjustment deletes the group quarters (GQ) people from the DA totals using Census 2000 data. The reason for making this adjustment is that the GQ population is not part of the A.C.E. universe. A final adjustment that could have been made would have been to remove the remote Alaska population from the DA totals, since it too is not part of the A.C.E. universe. Since this population is small, the DA sex ratios would not be affected in any meaningful way. See U.S. Census Bureau (2003) for technical details.

SYNTHETIC ESTIMATION

The coverage correction factors for detailed post-strata ij were calculated as:
\[
\mathrm{CCF}_{ij} = \frac{\mathrm{Re}\,\mathrm{DSE}^{C}_{ij}}{\mathrm{Cen}_{ij}}
\]

where:

\(\mathrm{Re}\,\mathrm{DSE}^{C}_{ij}\) are the correlation bias adjusted DSEs for post-strata \(ij\).
\(\mathrm{Cen}_{ij}\) are the census counts for post-strata \(ij\), including late census adds.

A coverage correction factor was assigned to each post-stratum. The post-strata excluded persons in group quarters or in remote Alaska. Effectively, these persons have a coverage correction factor of 1.0. In dealing with duplicate links to group quarters persons, the person in the group quarters was treated as if (s)he was a correct enumeration or as if this was their correct residence on Census Day.

A synthetic estimate for any area or population subgroup b is given by:
\[
\hat{N}_{b} = \sum_{ij \in b} \mathrm{Cen}_{b,ij}\, \mathrm{CCF}_{ij}
\]

Chapter 3. Correcting Data for Measurement Error

INTRODUCTION

The original A.C.E. estimates were found to be unacceptable because they failed to detect significant numbers of erroneous census enumerations. There were also suspicions that the A.C.E. may have included residents in its P sample that were actually nonresidents. Thus, the major goal for the A.C.E. Revision II estimates includes a correction of these measurement errors. One aspect of these corrections involves correcting a subsample of the A.C.E. data. Another aspect involves correcting measurement errors that cannot be detected with the information available in the subsample. These additional errors, which are identified via a duplicate study, are discussed in Chapter 5.

To understand the measurement error correction process, it is important to be familiar with the various sources of available information. These are summarized in the following table. The A.C.E. estimates produced in March 2001 were based on the Full E and P samples, which are probability samples of over 700,000 persons in 11,303 block clusters. The Matching Error Study (MES) and the Evaluation Follow-up (EFU) were two programs that had been planned to evaluate the March 2001 A.C.E. estimates. These evaluations were conducted in a subsample of 2,259 block clusters selected from the original 11,303 block clusters. A further subsample of persons within these block clusters was selected for the EFU evaluation. The probes used for EFU were designed to capture unusual living situations. The PFU/EFU Review was not part of the planned evaluations. It was conducted in order to resolve major discrepancies in enumeration status between the EFU and PFU results.
Thus, the Review E sample is a subsample of the EFU E sample. The Revision E and P samples are referred to as such for purposes of producing A.C.E. Revision II estimates. These samples are essentially the same as the Evaluation E and P samples for EFU, but the data have undergone a major recoding to correct for measurement error. This chapter discusses the measurement error corrections made to the E- and P-Revision samples. These corrected data, along with other measurement error corrections identified by the duplicate study, were used to adjust the Full E and P samples to produce A.C.E. Revision II estimates.

Table 3-1. Overview of A.C.E. Revision II Data Sources

Program | Sample | Sample size | What & when
Decennial census | | | Spring 2000
A.C.E. | Full E and P samples | E & P: About 700,000 persons in 11,303 block clusters | A.C.E. Person Interviewing (PI), Summer 2000; A.C.E. Person Follow-up (PFU), Fall 2000
Matching Error Study (MES) | Evaluation E and P samples | E & P: About 170,000 persons in 2,259 block clusters | Rematching Operation, December 2000
Evaluation Follow-up (EFU) | EFU E and P samples ¹ | E: About 77,000 persons in 2,259 block clusters; P: About 61,000 persons in 2,259 block clusters | Evaluation Person Follow-up (EFU), January-February, 2001
PFU/EFU Review | Review E sample | E: About 17,500 persons in 2,259 block clusters | Recoding Operation, Summer 2001
A.C.E. Revision II | Revision E and P samples | E: About 77,000 persons in 2,259 block clusters; P: About 61,000 persons in 2,259 block clusters | Recoding Operation, Summer 2002

¹ The number of sample cases included in the Evaluation Follow-up is less than those selected to be in this sample. Cases were excluded from follow-up for certain situations such as insufficient information or a duplicate enumeration.

GOALS AND BACKGROUND

The goal for A.C.E. Revision II was to correct as much measurement error as possible in the original A.C.E. estimates, given resource and timing constraints.² The primary sources of measurement error were determining residence and enumeration status, match status, and mover status.

Residence and Enumeration Status. The original A.C.E. did not detect all of the erroneous enumerations. See Adams and Krejsa (2001) and Fay (2002) for documentation. The Evaluation Follow-up (EFU) detected approximately 1.4 million additional erroneous enumerations in the E sample. Since the coding of enumeration status in the E sample was identical to the coding of residence status in the P sample, similar results for P-sample residence status coding were expected (i.e., additional nonresidents were expected to be found as a result of the EFU). To correct for the residence status errors, the A.C.E. Revision II utilized a recoding of the Evaluation Follow-up Interview in combination with the original A.C.E. to determine the best residence or enumeration status for each person in the Revision sample.

Matching Error. The Matching Error Study showed a net difference in match codes between the original March 2001 matching results and the evaluation matching results of 0.41 percent in the E sample and 0.20 percent in the P sample. Bean (2001) suggested this net difference translated into an increase in the dual system estimate of 483,938 people. To correct for matching error, results of the Matching Error Study and the A.C.E. Revision II recoding were used in conjunction to determine the appropriate match status for each person.

Mover Status. Raglin and Krejsa (2001) estimated a 2.6 percent gross difference rate in the mover status between the original A.C.E. and the Evaluation Follow-up. This translated into a negative bias of 465,000 in the DSE (assuming no other biases). Results of the Evaluation Follow-up were used to correct for mover status errors. The EFU questionnaire contained questions designed to probe for a person's mover status. This information was captured during the clerical recoding and during the initial coding of the Evaluation Follow-up form.

These types of measurement errors were corrected either by computer or clerically.

Two other sources of error were not part of the measurement error recoding portion of the A.C.E. Revision II.
These errors included geocoding errors and duplicates outside the search area. Certain geocoding errors detected by various geocoding evaluations were not included in the A.C.E. Revision II.³ Within the P sample, 245,926 production nonmatched residents were found outside the search area⁴ and 195,321 production correct enumerations in the E sample were found outside the search area. See Adams and Liu (2001). Some of the correct enumerations outside the search area were identified by the EFU interview and, hence, were reflected in the revised coding.⁵ Duplicates found outside the search area as a result of computer matching (see Chapter 5) were not handled by clerical coding. They were accounted for in the dual system estimator using estimation techniques. See Chapter 6 for a full description of the estimation techniques.

RESIDENCE STATUS AND ENUMERATION STATUS

As already noted, the original March 2001 A.C.E. underestimated the number of erroneous enumerations. To correct for this, the best residence status code was based on available field follow-up data. Duplicates were corrected using a separate process. The following data were available for measurement error correction:

* Person Interview (PI). The PI was the original A.C.E. enumeration of the P sample. It was a Computer-Assisted Personal Interview questionnaire designed to fully enumerate persons in the A.C.E. It was conducted by either phone or personal visit between April and September, 2000.
* Person Follow-up (PFU). The PFU was the follow-up used to assign residence and enumeration status, whenever those items were not determined, after the before follow-up matching (Childers, 2001). It was conducted by personal visit in October and November, 2000, approximately 6-7 months after Census Day.
* Evaluation Follow-up (EFU). The EFU was an evaluation of the A.C.E. designed to detect unusual living situations using additional probes and additional interviewing techniques (e.g., flashcards). It was conducted by personal visit in January and February, 2001, approximately 9-10 months after Census Day.

² In order to complete the A.C.E. Revision II estimates on time, 12 weeks were allotted for coding. Analysts at the National Processing Center were expected to code approximately 25,000 cases in this time frame.
³ As part of the A.C.E., several evaluations of geocoding error were conducted on various subsamples of the A.C.E., most notably Targeted Extended Search 2 (TES2) and Targeted Extended Search 3 (TES3). Results of these evaluations can be found in Adams and Liu (2001).
⁴ For the 2000 A.C.E., the search area, or area in which a person can be considered a correct enumeration or match, was the cluster and any census block touching the cluster.
⁵ Some of the cases in TES2 were evaluated using the Evaluation Follow-up questionnaire. For these cases, results of the geocoding evaluation were included in the Evaluation Follow-up; however, if a case was in TES2, but not in the Evaluation Follow-up, no geocoding evaluation results were included.

Results of the Person Interview were used to assign A.C.E. residence status by computer to all people in A.C.E. who did not need follow-up. In contrast, the PFU was used to assign residence status for anyone who was eligible for follow-up (Childers, 2001). The PFU is similar to the PI. The PFU process interviewed both P-sample and E-sample people. The EFU followed up a sample of people sent to PFU and a sample of those not sent to PFU. This allowed the residence/enumeration status of a representative sample of people eligible for field follow-up to be evaluated.

There were measurement errors in both the A.C.E. PFU and EFU resulting from limitations of their respective interviews. These errors are documented in Bean (2001) and Adams and Krejsa (2001), respectively. Also, the EFU was not strictly coded according to census residence rules. To evaluate the E sample for ESCAP II, the Census Bureau conducted the PFU/EFU Review in the summer of 2001. Expert matchers reviewed a subsample of the EFU E sample and applied consistent census residency rules. These analysts were assumed to make negligible errors; therefore, the PFU/EFU Review was considered to be free of coding error, given available data.

For A.C.E. Revision II, this high-quality coding was needed for subsamples of the A.C.E. P and E samples that were large enough to provide accurate subgroup estimates of net coverage. Twelve weeks of coding time were allotted to clerically code approximately 25,000 cases. However, there were over 100,000 cases needing codes. To assign the highest quality codes while meeting a demanding schedule, keyed data from both the PFU and EFU forms were used to augment clerical coding procedures. An automated coding algorithm, based on specific responses to the PFU and EFU questionnaires, was used to determine an appropriate code for each case. This was done for both the PFU interview and the EFU interview. The automated coding also assigned a Why code that describes the reason why the particular code was assigned. There were more than 60 possible Why code categories. A final code was assigned to each case using the following three-step process:

* Validation. Determine for each category of Why code if the automated coding is of high quality using the PFU/EFU Review as a truth deck.
* Targeting. Target only those Why code categories that have low levels of agreement between the automated coding and the PFU/EFU Review data.
* Clerical Review. Clerically recode only those cases in the targeted Why code categories. The clerical recoding takes advantage of handwritten interviewer comments.

Validation of Keyed Data

To validate the quality of coding produced by the keyed data algorithm, skip patterns for both questionnaires were programmed to determine an appropriate match code and Why code for each case. Then, for both the PFU and EFU forms, the percentage agreement with the original coding (either production coding or the coding of the EFU form) for the respective form, the percentage agreement with the PFU/EFU Review, and the residual risk were examined.
That is, the following calculations were performed twice, once for PFU and once for EFU.

The residual risk of disagreement (i.e., potential bias) represented the number of cases at risk for being coded wrong due to accepting the automated code for categories defined by questionnaire responses. Cases subject to risk were those where the automated code and original code agreed. If they disagreed, the automated code was rejected and the case was sent for clerical review. The risk for the cases agreeing is calculated as follows:

\[
\text{risk} = \text{Agree}_{K} - \text{Agree}_{Rev}
\]

where

\(\text{Agree}_{K}\) = The weighted number of cases whose code from the keyed data agreed with the original production code.
\(\text{Agree}_{Rev}\) = Of those cases where the code from the keyed data agreed with the original production code, the weighted number of cases whose code from the keyed data agreed with the PFU/EFU Review code.

The term risk, rather than error, is used because some potential coding changes may not have had an effect on the DSE. For example, people who were in group quarters have a residual risk of 26,517 after computer coding.
These represent cases that probably should have been coded as erroneous enumerations, but were not. However, some of the 26,517 cases could be unresolved, which have a probability less than one of being correct.

The automated coding results for a given Why code category were rejected if the residual risk was too high or if there were not enough cases to make an informed decision. The exception to this rule was the category consisting of cases without any indication of living in a group quarters or other residence. This group was, by far, the largest category for both the PFU and EFU, so a higher residual risk (footnote 6) was expected.

Targeting Cases for Clerical Review

After the decision was made to accept or reject the automated code for each Why code category, cases were targeted for clerical review. Analysts, who were the highest level of clerical matchers, performed the clerical review. Due to their experience and additional training, they were assumed to make negligible errors in coding.
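The residual-risk calculation described under Validation of Keyed Data can be sketched in a few lines. This is only an illustration under stated assumptions: the operator between the two terms is not legible in this copy of the text and is taken here to be a subtraction (consistent with risk being a count of cases), and the record layout (`weight`, `keyed_code`, `original_code`, `review_code`) is hypothetical rather than the Census Bureau's file format.

```python
def residual_risk(cases):
    """Weighted count of cases whose keyed-data code agreed with the
    original code but did not agree with the PFU/EFU Review code.

    Assumes risk = Agree_K - Agree_Rev; field names are hypothetical.
    """
    agree_k = 0.0    # Agree_K: keyed code agrees with original code
    agree_rev = 0.0  # Agree_Rev: of those, keyed code also agrees with review code
    for case in cases:
        if case["keyed_code"] != case["original_code"]:
            # Disagreements are rejected and sent to clerical review,
            # so they are not part of the at-risk set.
            continue
        agree_k += case["weight"]
        if case["keyed_code"] == case["review_code"]:
            agree_rev += case["weight"]
    return agree_k - agree_rev


example_cases = [
    {"weight": 10.0, "keyed_code": "CE", "original_code": "CE", "review_code": "CE"},
    {"weight": 5.0, "keyed_code": "CE", "original_code": "CE", "review_code": "EE"},
    {"weight": 3.0, "keyed_code": "EE", "original_code": "CE", "review_code": "CE"},
]
print(residual_risk(example_cases))
```

Because the result is an absolute weighted count rather than a rate, a category with many cases will tend to show a larger value even when its coding is comparatively reliable.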
Footnote 6: Absolute risk, rather than relative risk, is used. Therefore, larger categories tended to have higher risks.

In general, cases did not go to clerical review if both the PFU and EFU automated codes agreed, the mover statuses agreed, and the Why code category was deemed to be of high enough quality. In some instances, cases were exempt from clerical review because they could be coded based on information available in data files. For many of these situations, consistent and complete data were obtained from both the PFU and EFU interviews. These cases included:
* Census Usual Home Elsewhere. If the person claimed a Usual Home Elsewhere on certain types of census forms, they were counted as a correct enumeration (footnote 7) within the cluster and did not need clerical review.
* Geocoding Errors from Initial Housing Unit Matching. If a case should not have been sent to PFU or EFU and was only sent due to clerical error in the initial production matching, then it did not need clerical review.

In contrast, some cases were automatically sent to clerical review. For example, this included cases in the PFU/EFU Review that resulted in a conflicting status, noninterview cases, or cases where mover dates could not be determined from the EFU keyed data. Some of the cases that went to clerical review did so because the original A.C.E. or PFU results did not agree with the EFU results. Most of the cases went to clerical review because the automated coding process was not reliable for that Why code category.

For P-sample inmovers, there was no validation data. Cases where the original EFU mover status did not match the mover status from the keyed data, or the residence status from the keyed data did not match the original EFU residence status, were sent to clerical review. Noninterview cases or cases where mover dates could not be determined from the keyed data were also sent to clerical review.

Cases with the following attributes were sent to clerical review:
* the code from the keyed data for either form was not accepted for that case.
* the code from the keyed data was accepted for both forms, but at least one of the codes from the keyed data did not agree with its original code (i.e., the PFU code from the keyed data did not agree with production or the EFU code from the keyed data did not agree with the original EFU code).
* for P-sample people, the mover status from the keyed data did not agree with mover status assigned during the EFU coding.
* there was write-in information in open-ended questions on the form that could not be coded.
* the case was a possible match in before follow-up matching and the production and original EFU code disagreed.
* the case was a duplicate in either the original EFU coding or production after follow-up coding.
* the case was not yet flagged for clerical review and the PFU code from the keyed data did not agree with EFU code from the keyed data, and one of the cases was not unresolved for certain reasons.
* the case was in the PFU/EFU Review and was conflicting or had a mover status disagreement between the keyed data and the original EFU mover status.

Clerical Review

The clerical review for A.C.E. Revision II was an analyst-only operation. The following data were collected:
* Match Code for each form
* Why Code for each form
* Respondent for each form
* Whether the respondents are the same for the two interviews
* Best Code. A code indicating which form is the better of the two forms
* Smooshed Code. Information from both forms combined to make a code to represent the true situation
* Mover Status. Mover Status from the EFU form for P-sample people

The match codes were assigned using the census residence rules to construct coding rules for the flow of the questionnaire. The best code could be one of four values:
* Both = The enumeration statuses were the same
* PFU = The PFU form provided better information
* EFU = The EFU form provided better information
* Conflicting = Similar caliber respondents (e.g., husband and wife; two neighbors) provided contradictory information for the case

To ensure reproducibility, computer edits were applied to the best code. If the analyst did not follow pre-specified rules, then the analyst had to review the case again or leave a note indicating the situation.
Footnote 7: A person can claim a usual home elsewhere if he or she is enumerated on certain types of census forms in group quarters (e.g., military, shipboard, and certain types of special places like shelters). If a person on one of these forms claims a usual home elsewhere, then that person is counted at the address they indicate is their usual home. These people are part of the E sample because they are part of the housing unit universe.

CORRECTION OF MOVER STATUS ASSIGNMENT ERRORS

For each P-sample case, mover status was based on the EFU. This was used to determine whether or not the person needed clerical review.

CORRECTION OF MATCHING ERRORS

After the A.C.E. Revision II recoding operation corrected for enumeration, residence, and mover status, the results of the Matching Error Study (MES) were used to correct for false matches and false nonmatches. Some matching errors were a result of incorrect residence status coding and have been corrected as part of the recoding operation discussed above. To determine the correct match status, each of the possible combinations of match status was reviewed to determine the appropriate match status for each type of case. In general, the MES match status was assigned when there were changes from a match to a nonmatch or changes from a nonmatch to a match. For other situations, the match status from the EFU coding was assigned.

DATA OUTPUTS

After the clerical operation was completed, two files were assembled, one for the P sample and another for the E sample. The files contain match codes and Why codes (where appropriate) for original March 2001 A.C.E., EFU, PFU/EFU Review, Keyed Data, and A.C.E. Revision II Clerical Review. A final code is also assigned in the following hierarchy: A.C.E. Revision II Clerical Review, PFU/EFU Review, Keyed Data. This code reflects the final match, residence, and enumeration status for the A.C.E. Revision II process.

LIMITATIONS

There were several limitations on the data for the A.C.E. Revision II:
* Sample Size. The sample used to estimate measurement error is 2,259 clusters, containing about 10 percent of the persons in the sample used in the production A.C.E. Due to the smaller sample size, some subgroup estimates are subject to higher variances compared to those for the original March 2001 A.C.E.
* Conflicting Cases. Conflicting cases occurred when the PFU and EFU interviews had respondents of the same caliber (either both nonproxy or proxy respondents who were in the position to have similar knowledge about the household, e.g., two neighbors) who gave contradictory information. Since an additional field follow-up was not possible, these cases were coded as conflicting, were reviewed separately, and imputed.
* Data Collection Error. Cases were coded as best as possible. However, there was no attempt to correct for any residual data collection error. Any remaining respondent and interviewer errors could not be rectified without an additional field follow-up.

Chapter 4. A.C.E. Revision II Missing Data Methods

BACKGROUND

Missing data arises because it is not possible to obtain interviews for all sample cases or to obtain answers to all interview questions. This was as true for the A.C.E. Revision II as it was for the A.C.E. To put the A.C.E. Revision II missing data methods in perspective, a brief summary of the A.C.E. missing data adjustments is presented. For the A.C.E. P sample, a household noninterview adjustment compensated for noninterviewed households. Imputation methods were implemented to handle missing characteristics such as age or tenure. Further, match and residency probabilities were assigned when the respective match and residency statuses could not be definitively determined. There was no noninterview adjustment for the A.C.E. E sample, nor was there an imputation for missing characteristics, as the census imputations were used. However, E-sample cases with unresolved enumeration status were assigned probabilities of correct enumeration. See Ikeda and McGrath (2001) for details on the A.C.E. missing data methodology.

As will be discussed in Chapter 6, the A.C.E. Revision II estimation utilizes both the original A.C.E. coding results on the Full E and P samples and the Revision coding results on the smaller Revision Samples. Note that the A.C.E. Revision II subsample of the A.C.E. is referred to as the Revision Sample and the new coding operation is called the Revision coding. The missing data adjustments for the A.C.E. E and P samples were unchanged from those used to produce A.C.E. estimates, with the exception of the imputation for missing age. It was necessary to impute age again for the Full A.C.E. P sample because the A.C.E.
Revision II post-strata had different age groupings.

The Revision P sample used the same imputations for missing characteristics that the A.C.E. did, including the new age imputation. However, since A.C.E. Revision II measurement methodology had important differences from the A.C.E. measurement methods, it was necessary to develop new missing data methods. The A.C.E. Revision II missing data confronted three general types of new missing data problems:

1. New noninterviewed households: Revision P-sample households that were considered interviews in the A.C.E. were identified as noninterviews in the Revision coding when it was determined that none of the P-sample people there were valid Census Day residents.
2. Revision E- and P-sample cases with unresolved match, enumeration, or residency status, because of incomplete or ambiguous interview data from the Person Follow-up (PFU) or the Evaluation Follow-up (EFU).
3. Revision E- or P-sample cases with conflicting enumeration or residency status because contradictory information was collected in the PFU and the EFU interviews and it could not be determined which was valid.

AGE IMPUTATION

For the original A.C.E., P-sample people with missing age were assigned to age categories defined by the post-stratification plan. The A.C.E. Revision P-sample post-stratification divided the original A.C.E. post-stratification group of 0-17 year olds into two age groups: 0-9 and 10-17. Those people with missing age who had been assigned to the 0-17 group were reassigned to either the 0-9 or the 10-17 group. This reassignment assumed that the age distribution of people missing age was uniform within the 0-17 age grouping. Other people with unresolved age remained in the age group they had been originally assigned to.

HOUSEHOLD NONINTERVIEW ADJUSTMENT

The A.C.E. household noninterview adjustment generally spread the weights of P-sample noninterviewed housing units over interviewed housing units in the same block cluster with the same housing unit structure type. Housing units were determined to be noninterviews in two ways: 1) an interview was not conducted during the A.C.E. person interview operation, and 2) based on the results of the
A.C.E. PFU, it was determined that a whole household of P-sample people should not have been listed in the first place, and that another household may have been residents at that housing unit. Separate household noninterview adjustments were implemented for Census Day and A.C.E. Interview Day.

The A.C.E. Revision II noninterview adjustment methodology for A.C.E. Interview Day was essentially unchanged from the A.C.E. There was, however, an important change from the A.C.E. methodology for the noninterview adjustment for Census Day residency. In A.C.E. Revision II, a new imputation cell was defined. It included new noninterviews due to whole households of A.C.E. nonmovers who were determined to be inmovers or nonresident outmovers by the Revision coding. The new noninterview cell spread the weights of these noninterviewed units over housing units with at least one person who: 1) indicated he/she lived at another address, or 2) was identified as potentially fictitious in the A.C.E. These new noninterviews were assumed to have both a low match rate and a low residency rate similar to this group. Otherwise, the noninterview adjustment for Census Day used methodology similar to that of the A.C.E.

ASSIGNMENT OF PROBABILITIES OF CORRECT ENUMERATION, CENSUS DAY RESIDENCY, AND MATCH STATUS

In the A.C.E., P-sample people with unresolved Census Day residency or match status occurred in one of two ways. Firstly, the A.C.E. person interview may not have provided sufficient information for match and follow-up. Secondly, the A.C.E. PFU may not have collected adequate information for determining a person's Census Day residency status or their match status. Inadequate data collection can also result in unresolved enumeration statuses for A.C.E. E-sample people. In the A.C.E. Revision II, the EFU was also the source of unresolved cases. How a case was imputed depended on how it became unresolved.

Imputation for People with Insufficient Information for Match and Follow-Up

The Revision P-sample people with insufficient information for match and follow-up tended to be the same people
who had insufficient information for match and follow-up in the A.C.E., except for some rare cases with coding changes. Note that people who had insufficient information in the A.C.E. were not sent to EFU. There were about three million weighted people with insufficient information for match and follow-up in both the Full and Revision P samples.

In the A.C.E., P-sample people with insufficient information for match and follow-up were assigned a probability of Census Day residency equal to the residency rate of P-sample people who went to PFU. This methodology was improved in the A.C.E. Revision II by defining finer imputation cells that accounted for whether or not the housing unit was matched, nonmatched, or had a conflicting household. A conflicting household existed when the P- and E-sample households had no people in common. The probability of match was assigned based on the overall match rate, divided into groups based on mover status and housing unit match status, as was done in the A.C.E., and additionally on conflicting household status.

Imputation for People with Incomplete or Ambiguous Follow-Up

In contrast to P-sample people with insufficient information, the residency status for Revision P-sample people and the correct enumeration status for Revision E-sample people often changed from what it was in the A.C.E. These statuses changed because the Revision coding processed new information from the EFU, in addition to the original information from the PFU. Thus, while the EFU information resolved many cases that were unresolved in the A.C.E.
because of the PFU, EFU cases with incomplete or ambiguous information were a new source of unresolved cases. There were about the same number of weighted E-sample unresolved cases in the Revision sample as in the A.C.E., more than six million, with about half of these representing new unresolved cases. In contrast, the Revision coding generated substantially more P-sample unresolved cases than the A.C.E., 4.6 million compared to 2.7 million. This increase was due to the fact that all Revision P-sample cases (except those with insufficient information) went to EFU, including whole households of nonmatched people who had not gone to PFU. These people were assumed to be resolved in the A.C.E. and could have become unresolved because of the EFU.

The original A.C.E. missing data plan based the imputation cells on information obtained before any follow-up was conducted. An ad hoc fix to the A.C.E. missing data methodology was implemented using information from the PFU. See Cantwell and Childers (2001) for details. Based on the PFU keyed data, after follow-up groups for potential fictitious and lived elsewhere on Census Day were created. The new cells used information highly relevant to resident or enumeration status. Further, they showed greater discrimination in assigning probabilities of correct enumeration and residency. In A.C.E. Revision II, the before follow-up imputation cells were abandoned and the cells were defined based on after follow-up information. This change was the single most important improvement in the A.C.E. Revision II missing data methodology.

The after follow-up group definitions were based on keyed responses to the PFU and EFU questionnaire checkboxes and the Why codes. Why codes were clerically applied codes that reflected responses in the questionnaire checkboxes, as well as handwritten notes. See Adams and
Krejsa (2002) for a detailed description. The keyed results and Why codes helped identify the following:
* unresolved cases with the same history, i.e., the recipient cells.
* resolved follow-up cases with the same history up to the point of being unresolved, i.e., the donor pool.

PFU after follow-up groups were defined for those cases that were unresolved as a result of the PFU. Similarly, EFU after follow-up groups were defined for those cases unresolved because of the EFU. It was necessary to define separate groups for the PFU and EFU, because their interviews and questionnaires were different. However, the same after follow-up groups were employed for the P- and E-sample unresolved cases, as the PFU and EFU questions about Census Day residency were the same as the PFU and EFU questions about enumeration
status.

It is useful to distinguish between uninformative and informative unresolved cases:
* uninformative unresolved: the follow-up was a noninterview or an incomplete interview, though there was no evidence of an erroneous enumeration or nonresident.
* informative unresolved: a follow-up interview was conducted, and there was evidence of an erroneous enumeration or nonresident.

Note that when one interview was uninformative unresolved, but the other interview was resolved, the Revision coding selected (i.e., the code was based on) the resolved interview. On the other hand, when the unresolved interview was informative, the Revision coding could choose the unresolved interview over the resolved one. See Adams and Krejsa (2002) for details of the Revision coding.

It often happened that both the PFU and EFU interviews were unresolved. To assign this case to an imputation cell, the unresolved interview that was more informative was selected. When both interviews had the same level of information, the EFU was typically selected over the PFU, because questions on the EFU questionnaire were more sharply defined.

Consider the following example of an after follow-up group. One cell of unresolved E-sample people or recipients was defined as people with evidence from the EFU interview that they had moved in since Census Day, or moved out before Census Day, though the EFU interview did not provide the address they moved to or from. It was impossible to determine the enumeration status of these people, since it was unclear if their Census Day address was in the A.C.E. cluster. The corresponding donor pool consisted of those resolved people who indicated in the EFU that they had moved in after Census Day or moved out before Census Day. Generally, these people provided their mover address in the EFU. An analogous after follow-up group was formed for people unresolved because they indicated they were movers in the PFU interview. These groups are characterized as informative, because the follow-up provided evidence of an erroneous enumeration.

Table 4-1 shows the nine EFU after follow-up groups, while Table 4-2 shows the nine PFU after follow-up groups. People who moved in after Census Day or moved out before Census Day were the largest informative after follow-up group. Another important informative after follow-up group consisted of people who, according to the follow-up, had another residence such as a vacation home, though the follow-up did not indicate whether the other residence or the sample address was the Census Day residence. The noninterview groups and the "didn't answer other residence questions" group were the larger uninformative
groups.

Table 4-1. EFU After Follow-up Groups

Informative groups
* The followed-up person "Lived elsewhere" or at another residence, but the address was not given.
* Followed-up person moved in after Census Day or out before Census Day, but Census Day address not given.
* Respondent indicated the followed-up person "Never lived here" at the sample address, but did not provide the Census Day address.
* The followed-up person had another residence, but did not indicate whether the sample address or other residence was the Census Day residence.
* Followed-up person moved in or moved out, but no move dates given.

Uninformative groups
* The respondent indicated the followed-up person "Lived here" at the sample residence, but did not answer the other residence question.
* The respondent answered the current residence question, but did not answer the group quarters and other residence questions.
* The respondent did not answer the usual residence question, nor the group quarters and other residence questions.
* Potentially fictitious person, no respondents knew of the followed-up person.

Some of the larger EFU groups were subdivided by A.C.E. operational variables, such as whether or not the household went to PFU, or whether the household was conflicting. The uninformative after follow-up groups tended to have imputed probabilities of correct enumeration or residence close to one, typically in the range of 0.92 to 0.99.
In contrast, the informative after follow-up groups had smaller probabilities, often less than 0.25. The probability of correct enumeration is calculated as the weighted proportion of correct enumerations in the donor pool. For example,

Probability of correct enumeration = (Weighted CEs in Donor Pool) / (Weighted Resolved Enumerations in Donor Pool).

For the P sample, probabilities of residency and match status were calculated analogously.

Table 4-2. PFU After Follow-up Groups

Informative groups
* The followed-up person "Lived elsewhere" or at another residence, but the address was not given.
* Followed-up person moved in after Census Day or out before Census Day, but Census Day address was not given.
* The respondent indicated the followed-up person did not live here at the sample address, but did not indicate the other address and did not answer the group quarters and other residence questions.
* The followed-up person had another residence, but did not indicate where the usual residence was.

Uninformative groups
* The respondent indicated the followed-up person "Lived here" at the sample residence, but did not answer the other residence question.
* The respondent answered the usual residence question, but did not answer the group quarters and other residence questions.
* The "lived here" question is Don't Know/refused, and the group quarters and other residence questions were not answered.
* Blank questionnaire.
* Potentially fictitious person, no respondents knew of the followed-up person.

Imputation for Conflicting Coding Cases

When the A.C.E. EFU and PFU interviews had contradictory information, the Revision coding procedure assigned the case a conflicting code. Note that a conflicting code is different than a conflicting household. All conflicting cases in the Revision coding process were sent to analysts for clerical review. By examining the handwritten notes of interviewers, analysts could often determine which of the two interviews was better and assign the appropriate code.
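The two imputation rules in this part of the chapter can be sketched together: an unresolved case takes the weighted proportion of correct enumerations in its donor pool, and a conflicting case whose two interviews are judged equally likely to be correct takes the fixed value 0.5 stated in this section. The sketch below is illustrative only; the record fields (`weight`, `status`, `group`, `conflicting`), the status labels `"CE"` and `"EE"`, and the group name are hypothetical.

```python
def donor_pool_probability(donor_pool):
    """Weighted CEs divided by weighted resolved enumerations in the pool."""
    weighted_ce = sum(d["weight"] for d in donor_pool if d["status"] == "CE")
    weighted_resolved = sum(
        d["weight"] for d in donor_pool if d["status"] in ("CE", "EE")
    )
    if weighted_resolved == 0:
        raise ValueError("donor pool contains no resolved enumerations")
    return weighted_ce / weighted_resolved


def imputed_probability(case, donor_pools):
    """Probability of correct enumeration assigned to one unresolved case."""
    if case["conflicting"]:
        # Conflicting cases of equal interview quality are assigned 0.5.
        return 0.5
    return donor_pool_probability(donor_pools[case["group"]])


# Hypothetical donor pool for one informative after follow-up group.
pools = {
    "moved_no_address": [
        {"weight": 10.0, "status": "CE"},
        {"weight": 30.0, "status": "EE"},
    ]
}
print(imputed_probability({"conflicting": False, "group": "moved_no_address"}, pools))
```

For the P sample the same proportion would be computed with residency or match status in place of enumeration status.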
There were some cases where the interviews appeared to be of equal quality, such as when both respondents were household members or both respondents were proxies of equal caliber. For these conflicting cases, the interviews seemed equally likely to be correct based on the analysts' expertise. Therefore, the probability of correct enumeration for Revision E-sample conflicting cases and the probability of Census Day residency status for Revision P-sample conflicting cases were assigned to be 0.5. It should be noted that the Revision coding resulted in considerably fewer conflicting cases than the PFU/EFU Review Sample. According to Adams and Krejsa (2001), the PFU/EFU Review Sample had about 2.6 million weighted conflicting people in contrast to only about 100,000 weighted conflicting people in the Revision Samples.

Chapter 5. Further Study of Person Duplication in Census 2000

INTRODUCTION

Evaluations of the March 2001 coverage estimates indicated the A.C.E. failed to detect a large number of erroneous census enumerations. One type of these census erroneous enumerations was duplicate census enumerations; that is, census enumerations included in the census two or more times. The A.C.E. was not specifically designed to detect duplicate census enumerations beyond the search area. However, there was an expectation that the A.C.E. would detect that these E-sample enumerations had another residence, and that, roughly half the time, this other residence was the usual residence. Feldpausch (2001) showed this expectation was not met.

For purposes of constructing A.C.E. Revision II estimates, matching and modeling techniques were used to identify duplicate links between the Full E and P samples to census enumerations. The matching algorithm used statistical matching to identify linked records. Statistical matching allowed for the matching variables not to be exact on both records being compared. Because linked records may not refer to the same individual, even when the characteristics used to match the records are identical, modeling techniques were used to assign a measure of confidence, the
duplicate probability, that the two records refer to the same individual. These duplicate probabilities were used in the A.C.E. Revision II estimates.

This chapter documents the matching and modeling methods that were used to identify duplicate links and to produce duplicate probabilities. Note that this study was not intended to identify which enumeration was in the correct location. Chapter 6 describes how to compute the conditional probability that the sample case was in the correct location given that it had a link to a census enumeration outside the A.C.E. search area. This calculation impacts the correct enumeration status in the E sample and the residence status in the P sample. A full discussion of the estimation components is given in Chapter 6.
BACKGROUND

Mule (2001) reported results for initial attempts at measuring the extent of person duplication in Census 2000. This work was conducted by an inter-divisional group as part of the further research to inform the ESCAP II decision on adjusting census data products. This study is referred to as the ESCAP II duplicate study in this chapter.

The ESCAP II duplicate study used conservative computer matching rules to minimize the number of false matches that could be introduced when doing a nationwide search, since there was no clerical review of the results. As a consequence of the matching rules, comparisons to benchmarks indicated that the ESCAP II duplicate estimates were a lower bound. Specifically, comparing the ESCAP II results within the A.C.E. sample area to the A.C.E. clerical matching results showed that only 37.8 percent of the census duplicates were identified. Fay (2001, 2002) estimated the matching efficiency at 75.7 percent when accounting for the census records out-of-scope for the A.C.E. duplicate search. The out-of-scope records were those that were reinstated and deleted from the Housing Unit Duplication Operation, documented in Nash (2000).

The ESCAP II matching was a two-step process. First, the sample of census records was matched to the full census on first name, last name, month of birth, day of birth, and computed age. Age was allowed to vary by one year. Middle initials and suffixes being scanned into the first name field were accounted for; however, the other characteristics had to be exact matches at this stage. This first-stage match established a link between households. In the second stage, all person records in the linked households from the first stage were statistically matched using first name, middle initial, last name, month of birth, day of birth, and computed age. The matching parameters used in the statistical matching were borrowed from other Census 2000 matching operations. Mule (2001) describes this matching algorithm in more detail.

To reduce the impact of false matches, particularly with respect to persons with common names and the same month and day of birth, model weights were applied to each set of linked records as a measure of confidence that
the linked records were indeed duplicates. Due to schedule constraints, a national Poisson model was used in lieu of a probability model.

The ESCAP II census duplicate methodology satisfied the intended project goals and provided a valuable evaluation of the census by showing that person duplication existed. However, limitations of the methodology made it difficult to estimate the magnitude of person duplication in the census.

OVERVIEW OF THE DUPLICATE STUDY PLAN

Like the ESCAP II study, the A.C.E. Revision II duplicate plan involved matching the Full E and P samples to the census to establish potential duplicate links. Then, modeling techniques were used to identify the links most likely to be duplicate enumerations and to assign a measure of confidence that the links are duplicates. Key differences with the ESCAP II study include extending the use of statistical matching and developing models to assign a duplicate probability to the links. An advantage of duplicate probabilities over the Poisson model weights used in the ESCAP II study is that all duplicate links outside the A.C.E.
search area could be reflected in the A.C.E. Revision II estimates. Fay (2001, 2002) used a subset of the ESCAP II duplicate links to produce a lower bound on the level of erroneous enumerations that the A.C.E. did not measure.

Estimates of census duplication were based on matching and modeling E-sample cases to the census. For purposes of A.C.E. Revision II estimation, the P sample was also matched to the census. However, these results did not contribute to estimates of person duplication in the census. The A.C.E. Revision II estimation methodology adjusted the A.C.E. correct enumeration rate for E-sample cases with links outside the A.C.E. search area. Further, the A.C.E. Revision II estimation methodology adjusted the A.C.E. match rate for P-sample cases that linked to census cases outside the search area.

The matching algorithm consisted of two stages. The first stage was a national match of persons using statistical matching as described in Winkler (1995). Statistical matching attempted to link records based on similar characteristics or close agreement of characteristics. Exact matching required exact agreement of characteristics. Statistical matching allowed two records to link in the presence of missing data and typographical or scanning errors.

Six characteristics common to both files, called matching variables, were used to link records in the Full E and P sample with records in the census. Matching parameters associated with each matching variable were used to measure the degree to which the matching variables agreed between the two records, ranging from full agreement to full disagreement. The measurement of the degree to which each matching variable agreed was called the variable match score. The overall match score for the linked records was the sum of the variable match scores.

Full agreement of at least four characteristics was required to be considered a duplicate link. Because this study was a computer process without the benefit of a clerical review, this limitation of the statistical matching was necessary in order to minimize linking records with similar characteristics that represented different people. This was a particular concern when looking for duplicate enumerations across the entire country. The need to use statistical
matching at the first stage was apparent after the limited success of the ESCAP II exact matching procedure to identify A.C.E. duplicates in the A.C.E. sample areas. The statistical matching yielded better identification of the A.C.E. duplicates, but to identify all of the A.C.E. duplicates would have required fewer characteristics to be exact matches. This could potentially lead to a high number of false links.

The search for duplicate links between the Full E and P samples and the census was limited to those pairs that agreed on certain identifiers, or blocking criteria. Blocking criteria were sort keys that were used to increase the computer processing efficiency by searching for links where they were most likely to be found. For instance, to search only for duplicates when the first and last names agreed, both the sample and census files would have been sorted by the blocking criteria of first and last name. Then, all possible pairs within each first name/last name combination would have been searched for duplicate links.
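The blocking idea can be sketched as follows: census records are grouped by a key, and each sample record is compared only with census records that share its key. The text does not list the four sets of blocking criteria used in the study, so the two key functions below (name, and last name plus birth month and day) are purely hypothetical, as are the record fields. Running several passes and taking the union of the candidate pairs illustrates how multiple criteria recover pairs that a single criterion would miss.

```python
from collections import defaultdict


def name_key(rec):
    # Hypothetical blocking key: first and last name.
    return (rec["first"].upper(), rec["last"].upper())


def birth_key(rec):
    # Hypothetical blocking key: last name plus month and day of birth.
    return (rec["last"].upper(), rec["mob"], rec["dob"])


def blocked_pairs(sample, census, keys):
    """Return (sample_id, census_id) candidate pairs from all blocking passes."""
    seen = set()
    pairs = []
    for key in keys:
        # Group census records by the blocking key for this pass.
        blocks = defaultdict(list)
        for c in census:
            blocks[key(c)].append(c)
        # Compare each sample record only within its own block.
        for s in sample:
            for c in blocks.get(key(s), []):
                pair = (s["id"], c["id"])
                if pair not in seen:
                    seen.add(pair)
                    pairs.append(pair)
    return pairs


sample = [{"id": "S1", "first": "Ann", "last": "Lee", "mob": 5, "dob": 12}]
census = [
    {"id": "C1", "first": "Ann", "last": "Lee", "mob": 5, "dob": 12},
    {"id": "C2", "first": "Anne", "last": "Lee", "mob": 5, "dob": 12},
    {"id": "C3", "first": "Bob", "last": "Ray", "mob": 1, "dob": 1},
]
print(blocked_pairs(sample, census, [name_key, birth_key]))
```

In this toy data the name pass alone misses the record with the variant spelling "Anne"; the second pass brings it in as a candidate, which would then be scored by the statistical matching.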
Although true matches can be missed by using blocking criteria, multiple sets of blocking criteria minimize the number of missed matches. The A.C.E. Revision II duplicate study utilized four sets of blocking criteria.

At the first stage of matching, it was possible for one sample case to link to multiple census records. All of these links were retained for the second stage of matching.

The second stage of matching was limited to matching persons within households. If an E- or P-sample case linked to a census record in a group quarter, the case did not go to the second stage. Using results from the first stage of matching, a link between two housing units was established. The second stage was a statistical match of all household members in the sample housing unit to all household members in the census housing unit. The second-stage matching variables were the same as the first stage; however, the matching parameters differed. Using a subset of the first-stage links, the second-stage matching parameters were derived using the Expectation-Maximization (EM) algorithm. See Winkler (1995) for a more detailed explanation. A key difference between the first- and second-stage parameters was the reduced emphasis on requiring last names to agree in the second stage. This intuitively makes sense, since second-stage matching was within a given household.

The household was the only set of blocking criteria used at the second stage of matching. Sample records were allowed to link to only one census record within the household. As a consequence, this limited the ability to identify within-household duplicate links. Each link had an overall match score based on the second-stage matching.

The set of linked records from the second-stage matching and the links to group quarter enumerations from the first stage consisted of both duplicate enumerations and person records with common characteristics. Two modeling approaches were used to estimate the probability that the linked records were duplicates. One approach used the results of the statistical matching and relied on the strength of multiple links within the household to indicate
person duplication. The second relied on an exact match of the census to itself and the distribution of births, names, and population size to indicate if the individual link was a duplicate. These two approaches were referred to as the statistical match modeling and the exact match modeling, respectively. These two approaches were combined so that each sample case with a link to a census enumeration had an estimated probability of being a duplicate.

The statistical match modeling was used when two or more duplicate links were found between housing units in the second stage. After the second-stage matching, each duplicate link between a sample household and census household had an overall match score. So, for each sample household, a set of match scores was observed. For any resulting set of match scores, a probability of not observing this set of match scores was estimated. See the attachment for details. The higher this probability, the more likely that the set of linked records in the household were duplicates.

The estimate of the probability of not observing this set of match scores assumed independence of the individual match scores within each household. This assumption was based on using the EM algorithm to determine the second-stage matching parameters. The probability of observing the individual match scores was estimated from the empirical distribution of individual match scores resulting from the second-stage matching. Further, this measure accounted for the number of times that a unique sample household was matched to different census households within a given level of geography. The probability of not observing this set of match scores was translated into a statistical match duplicate probability of 0 or 1 based on critical values that varied by level of geography.

The exact match modeling relied on an exact match of the census to itself. The methodology accounted for the overall distribution of births, frequency of names, and population size in a specific geographic area. Duplicate probabilities were computed separately by geographical distance of the links. Further, duplicate links were modeled separately by how common the last name was, as well as for Hispanic names.

The two approaches were combined to assign an estimated probability that the linked records were duplicates.
The duplicate probability for the links to group quarters in the first stage and one-person household links were from the exact match modeling. For all other links, the duplicate probability was the larger of the two model estimates. For nonexact matches, this was always from the statistical match modeling. For exact matches, adjustments were made to account for the integration of these two methods.

Based on the results of this matching and modeling, an overall estimate of census duplicates was derived from the E-sample links. Further, for each Full E- and P-sample person who linked outside the A.C.E. search area, these results provided the probability that they were in fact the same person. These duplicate probabilities were used in the A.C.E. Revision II estimates.

MATCHING ALGORITHM

Efforts to increase matching efficiency over the ESCAP II duplicate study included implementing statistical matching of persons at the first stage and the use of more discriminating matching parameters at the second stage.

Inputs

Both the Full E and P samples were matched to the census records. The E-sample records reflected any updates made by the clerical staff during the A.C.E. matching operation when the census characteristics were incorrectly transcribed or scanned. The P sample included all nonmovers, outmovers, and inmovers. The same matching algorithm was used for the Full E and P samples. The census files consisted of data-defined person records for both the household and group quarters populations.
Both the reinstated and deleted records from the Housing Unit Duplication Operation described in Nash (2000) were included in the matching, so these links could be reflected in the A.C.E. Revision II estimates.

First Stage: Person-Level Matching

The first stage was a statistical match of the Full E and P samples to the census. This was a national match where each Full sample case was compared with census records across the nation to assess how well the matching variables agreed. The matching variables were first name, last name, middle initial, month of birth, day of birth, and computed age.

The matching variables and parameters are given in Table 5-1. The agreement weight and the disagreement weight are the matching parameters of each variable. Standard matching parameters were used at the first stage. The relationship of the agreement and disagreement parameters translated into the match score for each variable. For example, the full agreement match score for first name was 2.1972, whereas the full disagreement match score was -2.1972. The sum of the variable match scores was the total match score. A total match score of 9.4006 indicated full agreement of all variables. A total match score of -9.4006, on the other hand, indicated full disagreement.

Table 5-1. First-Stage Matching Parameters

Matching variable   Type of comparison   Agreement weight (m)   Disagreement weight (u)   Agreement score ln(m/u)   Disagreement score ln((1-m)/(1-u))
First name          String (uo)          0.9                    0.1                       2.1972                    -2.1972
Last name           String (uo)          0.9                    0.1                       2.1972                    -2.1972
Middle initial      Exact                0.7                    0.3                       0.8473                    -0.8473
Month of birth      Exact                0.8                    0.2                       1.3863                    -1.3863
Day of birth        Exact                0.8                    0.2                       1.3863                    -1.3863
Computed age        Age (p)              0.8                    0.2                       1.3863                    -1.3863
Total                                                                                     9.4006                    -9.4006

The type of comparison indicated the statistical matching method for comparing the variables. For example, the string comparator was used for first name and last name. This method addressed typographical errors in names. For example, Tim and Tum can yield a positive agreement score. An exact match algorithm would have treated these as a disagreement. For age, the age values could have
been off by +/- one year and still receive a full agreement score on computed age.

The Statistical Research Division matching software called BigMatch documented in Yancey (2002) was used in the first stage. This software allowed a sample record to link to more than one census record. This capability was important, since it was possible for there to be more than two enumerations of the same person in the census.

Four blocking criteria were used. Blocking restricted the comparisons of records to only those that exactly agreed on certain values. Most records that did not agree on the values below are probably not duplicates. The blocking criteria were:
* First name, last name
* First name, first initial of last name, age groupings (0-9, 10-19, 20-29, etc.)
* Last name, first initial of first name, age groupings (0-9, 10-19, 20-29, etc.)
* First initial of first name, first initial of last name, month of birth, day of birth

All possible links within each blocking criterion were compared. For each comparison, the variable match score and the total match score were computed. The first-stage matching decision rules were as follows. First, a match must have had at least four of the match variables in full agreement. This meant that four of the variables had to have a match score equal to the agreement match score in Table 5-1. The one exception was the middle initial. When the middle initial was blank, it was considered to be in full agreement in this study since the middle initial was often missing on the sample and census records. In this case, the middle initial score was zero. Second, the total match score had to be 4.7 or greater. This minimum score was about half the total score for full agreement of all matching variables.

Table 5-2 shows the distribution of A.C.E. links within cluster that were identified by the resulting number of matching variables in full agreement. There were a total of 10,559 duplicate links identified by the A.C.E. clerical staff that agreed on the first letter of the first and last name. The table shows the number of identified A.C.E. duplicates as the number of matching variables in full agreement
decreased. The table also displays the number of total links that were identified. The percent of A.C.E. links in each row of the table decreases as the number of matching variables in full agreement decreases.

By requiring at least four matching variables to be in full agreement, 68.4 percent of these A.C.E. duplicates were identified. On the other hand, when only four of the six variables fully agreed, only 30.4 percent of the total links identified by this criterion were A.C.E. Revision II duplicates.
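The per-variable match scores in Table 5-1 follow directly from the agreement weight m and disagreement weight u. The short sketch below recomputes them from the tabulated weights; the dictionary keys and function names are illustrative choices, not identifiers from the BigMatch software.

```python
import math

# (m, u) weights as listed in Table 5-1; the key names are illustrative.
FIRST_STAGE_WEIGHTS = {
    "first_name": (0.9, 0.1),
    "last_name": (0.9, 0.1),
    "middle_initial": (0.7, 0.3),
    "month_of_birth": (0.8, 0.2),
    "day_of_birth": (0.8, 0.2),
    "computed_age": (0.8, 0.2),
}


def agreement_score(m, u):
    """Match score contributed when a variable is in full agreement."""
    return math.log(m / u)


def disagreement_score(m, u):
    """Match score contributed when a variable is in full disagreement."""
    return math.log((1 - m) / (1 - u))


total_agree = sum(agreement_score(m, u) for m, u in FIRST_STAGE_WEIGHTS.values())
total_disagree = sum(disagreement_score(m, u) for m, u in FIRST_STAGE_WEIGHTS.values())

for name, (m, u) in FIRST_STAGE_WEIGHTS.items():
    print(name, round(agreement_score(m, u), 4), round(disagreement_score(m, u), 4))
print("total", round(total_agree, 4), round(total_disagree, 4))
```

The totals agree with the 9.4006 and -9.4006 reported in Table 5-1 to four decimal places.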
Note that it was tempting to require that only three variables be in full agreement, since this would increase the number of A.C.E. duplicates by 20 percent. However, this change would substantially increase the number of false matches. Table 5-3 shows that introducing a minimum total score greatly increased the density of A.C.E. links identified.
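The two first-stage decision rules (at least four variables in full agreement, and a total match score of 4.7 or greater) can be sketched as follows. This is a simplified illustration under stated assumptions: each variable is treated as either fully agreeing or fully disagreeing, so the partial scores that the string comparator can produce are not modeled, and the function and key names are hypothetical.

```python
import math

# (m, u) weights from Table 5-1; key names are illustrative.
WEIGHTS = {
    "first_name": (0.9, 0.1),
    "last_name": (0.9, 0.1),
    "middle_initial": (0.7, 0.3),
    "month_of_birth": (0.8, 0.2),
    "day_of_birth": (0.8, 0.2),
    "computed_age": (0.8, 0.2),
}
AGREE = {k: math.log(m / u) for k, (m, u) in WEIGHTS.items()}
DISAGREE = {k: math.log((1 - m) / (1 - u)) for k, (m, u) in WEIGHTS.items()}

MIN_FULL_AGREEMENTS = 4
MIN_TOTAL_SCORE = 4.7


def passes_first_stage(agreements, middle_initial_blank=False):
    """Apply both decision rules to one candidate link.

    agreements maps each variable name to True (full agreement) or
    False (full disagreement). A blank middle initial counts as being
    in full agreement but contributes a score of zero.
    """
    full = 0
    total = 0.0
    for name, agrees in agreements.items():
        if name == "middle_initial" and middle_initial_blank:
            full += 1
            continue
        if agrees:
            full += 1
            total += AGREE[name]
        else:
            total += DISAGREE[name]
    return full >= MIN_FULL_AGREEMENTS and total >= MIN_TOTAL_SCORE


names_and_birthday = {
    "first_name": True, "last_name": True, "middle_initial": False,
    "month_of_birth": True, "day_of_birth": True, "computed_age": False,
}
everything_but_names = {
    "first_name": False, "last_name": False, "middle_initial": True,
    "month_of_birth": True, "day_of_birth": True, "computed_age": True,
}
print(passes_first_stage(names_and_birthday))
print(passes_first_stage(everything_but_names))
```

The second pattern has four variables in full agreement but a total score well under 4.7, which illustrates why the minimum score removes links that the agreement count alone would keep.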
Note that some A.C.E. duplicate links were dropped by using these criteria. This was a consequence of applying rules that reduced the false link rate.

Table 5-2. Distribution of Links Within A.C.E. Clusters by Full Agreement
[Percentages may not add due to rounding]

Number of variables    A.C.E. links                              Total links    Percent of A.C.E.
in full agreement      Count     Percent   Cumulative percent                   links in row
6                      2,348     22.2      22.2                  2,451          95.8
5                      2,895     27.4      49.7                  3,983          72.7
4                      1,983     18.8      68.4                  6,520          30.4
3                      2,211     20.9      89.4                  40,891         5.4
2                      954       9.0       98.4                  180,324        0.5
1                      164       1.6       99.9                  601,370        <0.1
0                      4         <0.1      100                   350,987        <0.1
Total                  10,559    100       100                   1,186,526      0.9

Table 5-3. Distribution of A.C.E. and Total Links Within A.C.E. Clusters
[Only include links with score of 4.7 or greater]

Number of variables    A.C.E. links    Total links    Percent of A.C.E.
in full agreement                                     links in row
6                      2,348           2,451          95.8
5                      2,868           3,763          76.2
4                      1,680           2,670          62.9
3                      0               0              n/a
2                      0               0              n/a
1                      0               0              n/a
0                      0               0              n/a
Total                  6,896           8,884          77.6

Second Stage: Household-Level Matching

The second stage of matching was restricted to the household population. The person links from the first stage established a link between two housing units. The second stage was a statistical match of the household members from the two housing units. A sample household was included in the second stage multiple times if the sample household had persons with links to multiple census households in the first stage. This was the same approach used for the ESCAP II duplicate study.

Table 5-4. Second-Stage Matching Parameters

Matching variable   Type of comparison   Agreement weight (m)   Disagreement weight (u)   Agreement score ln(m/u)   Disagreement score ln((1-m)/(1-u))
First name          String (uo)          0.9500                 0.0125                    4.3307                    -2.9832
Last name           String (uo)          0.9600                 0.5700                    0.5213                    -2.3749
Middle initial      Exact                0.0840                 0.0220                    1.3398                    -0.0655
Month of birth      Exact                0.6000                 0.0600                    2.3026                    -0.8544
Day of birth        Exact                0.3000                 0.0200                    2.7081                    -0.3365
Computed age        Age (p)              0.9750                 0.1325                    1.9959                    -3.5467
Total                                                                                     13.1984                   -4.1948

The matching variables were the same as the first stage: first name, last name, middle initial, month of birth, day of birth, and age. Table 5-4 gives the matching parameters. The data in this table have similar meaning as that for the
first-stage parameters in Table 5-1. Using a subset of the first-stage links, the second-stage matching parameters were derived using the EM algorithm as described in Winkler (1995). These parameters were anticipated to be more discriminating than the set used for the ESCAP II study. Since the first-stage matching established a link between two housing units, first name had more discriminating power than last name in the second stage. When first name fully agreed, it contributed 4.3307 toward the total score, while last name only contributed 0.5213 when it was in full agreement. Further, month of birth and day of birth were more powerful than computed age. This was expected since adults in a housing unit often have similar ages, but not the same month and day of birth.

The Statistical Research Division Record Linkage software described in Winkler (1999) was used for the second stage. Each sample record was linked to only one census record within the household, a one-to-one matching. There was no additional blocking criteria beyond household; all possible links within households were compared. Each link had a total match score ranging from -4.1948 to 13.1984.
This second-stage match score was used for the modeling. All links with a second-stage match score greater than 0.3419 were retained as input to the modeling.

Reverse Name Matching

Occasionally, first and last names were captured in reverse order on the data files. The first name was in the last name field and the last name was in the first name field.
When the data was in reverse order on one file but not the other, it was difficult to identify these duplicate links since the variable match scores for first and last name disagreed for both the first and second stage. To attempt to identify these cases, the first and last name fields were reversed and then matched to the census files a second time. The duplicate links from both runs, name in the usual order and in reverse order, were input to the modeling. When both methods identified the same duplicate link, the higher of the two match scores was retained and used in the modeling.

MODELING LINKS

Since the goal of this study was to provide duplicate information for calculating A.C.E. Revision II estimates, it was important to provide a measure of confidence that could be incorporated into the estimation methodology. Consequently, modeling efforts focused on methods for estimating the probability that two linked records were duplicate enumerations. An advantage of duplicate probabilities over the Poisson model weights used in ESCAP II was that all duplicate links outside the A.C.E. search area could be reflected in the A.C.E. Revision II estimates. The statistical and exact match modeling approaches were combined to yield an estimated duplicate probability for the linked records from the statistical matching of the E and P samples to the census.

Statistical Match Probability

The statistical match modeling was used when the second-stage matching resulted in two or more duplicate links. After the second-stage matching, each duplicate link between a sample household and census household had an overall match score. So, for each sample housing unit to census housing unit match, a set of match scores was observed. For any resulting set of match scores, a probability of not observing this set of match scores, Pr(NT), was estimated for each link within the sample household.
The higher this probability, the more likely that the set of linked records in the household were duplicates.

Since a sample housing unit could have been matched to more than one census housing unit during the second stage, there were multiple sets of duplicate links and match scores for each sample housing unit. Each set of duplicate links for a sample housing unit was assigned a separate Pr(NT) since the match scores differ for each matching attempt. Further, the Pr(NT) for each set of duplicate links for a sample housing unit varied because of the geographic distance of the duplicate links. As shown in the attachment, Pr(NT) was estimated by

$\Pr(NT) = \left[ 1 - \prod_{d=1}^{p} \Pr(X_d \ge x_d) \right]^{n}$

where
$\Pr(X_d \ge x_d)$ was the probability of getting a total match score $X_d$ greater than or equal to $x_d$,
$p$ was the number of duplicate links in the sample household, and
$n$ was the number of census housing units the sample household was matched with in the second stage within a given geographic area.

The estimate of the probability of not observing this set of match scores assumed independence of the individual match scores within each household. This assumption was based on using the EM algorithm to determine the second-stage matching parameters. The probability of observing the individual match scores was estimated from the empirical distribution of individual match scores resulting from the entire second-stage matching. Further, this measure accounted for the number of times that a unique sample household was matched to different census households within a given level of geography. The geographical levels were block, tract, same county (outside tract), same state (outside county), and different state.

For the E sample, this analysis was done at the E-sample household level. For the P sample, a household consisted of any combination of nonmovers, outmovers, and inmovers. To account for this, the duplicate links were analyzed separately by mover status when looking at patterns of match scores.

The probability of not observing this set of match scores was translated into a statistical match duplicate probability of 0 or 1 based on critical values that varied by
level of geography. Table 5-5 shows the minimum value of Pr(NT) for assigning a statistical match duplicate probability of 1 for E and P samples.

Table 5-5. Minimum Value for Assigning Statistical Match Probability

Geographic distance of linked records    Minimum Pr(NT)
                                         E sample    P sample
Same block                               0.00        0.25
Same tract (different block)             0.70        0.35
Same county (different tract)            0.97        0.60
Same state (different county)            0.97        0.60
Different state                          0.97        0.60

Duplicate links with a Pr(NT) greater than or equal to the minimum value in Table 5-5 were assigned a statistical match duplicate probability of 1. All other links were assigned a statistical match duplicate probability of 0.

Exact Match Probability

Given exact matching of the census to itself, duplicate probabilities were assigned to linked records by taking into account the overall distribution of births, frequency of names, and population size in a specific geographic area. Duplicate probabilities were computed separately by links within county, links within state and different county, and different states. Further, duplicate links were modeled separately by how common the last name was, as well as for Hispanic names. Fay (2002b) gives the model and preliminary results. The following are excerpts from Fay (2002b) to give the reader a general idea of the approach.

Like the Poisson model, the new approach uses frequencies of occurrences of combinations of first and last name. The result is an estimated probability of duplication for most matches, except for matches of frequently occurring names, where the probability of duplication is low and difficult to estimate with high relative precision.

This work results in a series of probability models, with parameters that can be estimated statistically from observed census data. A core model characterizes probabilities of duplication, triple enumeration (apparent enumeration of the same person three times), and other forms of multiple enumeration within a given geographic area. The other models account for duplication across domain. The first part of the core model expresses the probability of coincidentally sharing a birthday. A second set of expressions, a model for census duplication, is built on top of the model for coincidental sharing of date of birth.
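As an aside on the statistical match probability described before this section, the Pr(NT) calculation and the Table 5-5 cutoffs can be sketched in a few lines. This is a minimal illustration, not the production code: the function names are hypothetical, the tail probabilities in the example are made-up inputs, and the empirical tail is taken to be the simple share of second-stage scores at or above a given value.

```python
import math

# Minimum Pr(NT) values from Table 5-5, keyed by sample and geographic distance.
MIN_PR_NT = {
    "E": {
        "same block": 0.00,
        "same tract": 0.70,
        "same county": 0.97,
        "same state": 0.97,
        "different state": 0.97,
    },
    "P": {
        "same block": 0.25,
        "same tract": 0.35,
        "same county": 0.60,
        "same state": 0.60,
        "different state": 0.60,
    },
}


def empirical_tail(all_scores, x):
    """Share of observed second-stage scores that are >= x (assumed estimator)."""
    return sum(s >= x for s in all_scores) / len(all_scores)


def pr_not_observing(tail_probs, n):
    """Pr(NT) = [1 - prod_d Pr(X_d >= x_d)]^n under the independence assumption."""
    return (1.0 - math.prod(tail_probs)) ** n


def statistical_match_probability(pr_nt, sample, distance):
    """Return 1 when Pr(NT) meets the Table 5-5 minimum, otherwise 0."""
    return 1 if pr_nt >= MIN_PR_NT[sample][distance] else 0


# Hypothetical household: two links with tail probabilities 0.1 and 0.2,
# matched to three census housing units in the geographic area.
pr_nt = pr_not_observing([0.1, 0.2], n=3)
print(round(pr_nt, 6))
print(statistical_match_probability(pr_nt, "E", "same tract"))
print(statistical_match_probability(pr_nt, "E", "same county"))
```

In this made-up case the same Pr(NT) clears the E-sample cutoff for links within a tract but not the stricter cutoff for links across tracts, which shows how the critical values tighten with geographic distance.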
The core model combines the two models to account for observed patterns of exact computer matches of census enumerations. The core model provides a basis to estimate a probability that a given computer match links the same person instead of two persons coincidentally sharing a birthday. An approximate argument allows the core model to be extended to nested geographic categories, such as (1) counties, (2) other counties within state, and (3) other states.

The result of the exact match model is a duplicate probability greater than or equal to zero, but less than one for census records that agree exactly on first name, last name, month and day of birth and two-year age intervals.

Combining the Two Models

The two approaches were combined to give one duplicate probability to each E- and P-sample duplicate link. Table 5-6 summarizes the results of combining the two models. The duplicate probability for the links to group quarters in the first stage and one-person household links were from the exact match modeling. For all other links, the duplicate probability was the larger of the two model estimates, as indicated by the shaded cells in Table 5-6. For nonexact matches, the duplicate probability assignment was always based on the statistical match modeling. For exact matches in sample households with two or more persons, adjustments were made to account for the integration of these two methods. The exact match probabilities were determined conditionally, requiring a downward adjustment of the exact probabilities for the links, which the statistical match modeling assigned a probability of zero. The amount of the downward adjustment was based on the upward adjustment made when using the statistical match probability of one instead of the exact match probability.

Table 5-6. Combining the Two Modeling Results

Type of link     Size of sample HU    Type of match    Statistical match probability    Exact match probability
Housing Unit     1                    Exact            -                                [0,1)
                                      Nonexact         -                                -
                 2+                   Exact            1                                [0,1)
                                      Exact            0                                [0,1)
                                      Nonexact         1                                -
                                      Nonexact         0                                -
Group Quarter                         Exact            -                                [0,1)
                                      Nonexact         -                                -

- Modeling did not assign a value.

The results of this modeling provided, for each Full E- and P-sample person who linked to a census person outside the A.C.E. search area, the probability that they were in fact the same person. These probabilities, referred to as $p_t$ in Chapter 6, were used to obtain A.C.E. Revision II estimates.

Reinstated and Deleted Census Records

For the exact match modeling, separate probabilities were computed based on population distributions with and without the reinstated and deleted records. One computed duplicate probability allowed sample records to link to reinstated and deleted census records, while a second duplicate probability did not allow links to reinstated and deleted records. Under this second scenario, any links to reinstated or deleted records were assigned a duplicate probability of zero. The duplicate probabilities used in the A.C.E. Revision II estimation were those that allowed links to reinstated and deleted census records.

ASSESSMENT OF LINKS

Throughout the development of the Further Study of Person Duplication in Census 2000, the A.C.E. duplicate links found during production were the benchmark used to gauge whether the matching algorithm did a good job of finding true duplicates and minimizing the number of false links found within the block cluster.

This study and the ESCAP II duplicate study documented in Fay (2001, 2002) utilized the same method for estimating efficiency for the E sample. Basically, the method estimated the effectiveness of identifying A.C.E. clerical duplicates within the A.C.E. sample area and accounted for duplicate links to reinstated and deleted records that were out-of-scope for A.C.E. Instead of producing one overall efficiency measure, several measures were computed for various levels of detail including size of sample household and number of links between the units.

FORMING ESTIMATES OF DUPLICATES

Estimates of census duplicates were formed by summing the product of the sampling weight for the E-sample person, the duplicate probability, and the multiplicity factor.
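The weighted sum just described can be written out as a brief sketch. The record layout and the numbers are hypothetical and serve only to show the arithmetic; the multiplicity factor of 1/2 corresponds to the simple two-record case, where the same duplicate pair could have entered the sample from either side.

```python
def estimate_duplicates(links):
    """Sum of sampling weight x duplicate probability x multiplicity factor.

    links is an iterable of (sampling_weight, duplicate_probability,
    multiplicity_factor) tuples, one per linked E-sample person.
    """
    return sum(w * p * m for w, p, m in links)


# Hypothetical E-sample links: (weight, duplicate probability, multiplicity).
example_links = [
    (100.0, 1.0, 0.5),   # certain duplicate, simple A-to-B pair
    (250.0, 0.4, 0.5),   # less certain link, same multiplicity
]
estimated = estimate_duplicates(example_links)
print(estimated)
```

Without the multiplicity factor the same two links would contribute twice as much, which is the double counting the factor is meant to prevent.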
Since a sample of the census (E sample) was matched to the census, a naive approach would treat each duplicate link of A to B as one duplicate. However, had a different sample been drawn, it could have contained the B to A link. Applying a multiplicity factor of 1/2 in this simple case ensured that the estimate of this example was only one duplicate. See Mule (2002b) for more details on the computation of the multiplicity factor.

Attachment. Probability of Not Observing a Set of Match Scores

Each E-sample household had a set of duplicate links to a particular census household. Each duplicate link had a corresponding overall match score from the second-stage matching resulting in a pattern of match scores for the sample household. The task was to assess whether this observed set of match scores occurred because the links were duplicates or because the records had characteristics in common but were different people.

Objective: To estimate the probability of not observing this set of match scores or better for each E-sample household.

The hypothesis is that the higher the probability of not observing this set of match scores or better, the more likely the links represent duplicate enumerations.

Suppose a particular E-sample household has $p \ge 2$ duplicate links with observed match scores $x_1, x_2, \ldots, x_p$. Define Pr(NT) to be the probability of not observing the set of match scores or better, $(X_1 \ge x_1, X_2 \ge x_2, \ldots, X_p \ge x_p)$. This probability can be expressed as

$\Pr(NT) = \left[ 1 - \Pr(X_1 \ge x_1, X_2 \ge x_2, \ldots, X_p \ge x_p) \right]^{n}$ (1)

where $n$ was the number of different census housing units that the E-sample housing unit was linked to during the second-stage match. This calculation accounted for the fact that the more times the E-sample housing unit matched to different housing units, the greater the chance of obtaining this outcome.

Individual match scores $X_1, X_2, \ldots, X_p$ were assumed to be independent, since the second-stage matching parameters gave more emphasis to first name rather than last name. Further, the parameters gave more emphasis to month and day of birth rather than age. Under the independence assumption, (1) can be written as follows:
$\Pr(NT) = \left[ 1 - \prod_{d=1}^{p} \Pr(X_d \ge x_d) \right]^{n}$ (2)

The probability of observing a match score $X_d$ greater than or equal to $x_d$, $\Pr(X_d \ge x_d)$, was obtained from the empirical distribution of second-stage match scores. The probability in (2) was used for the P-sample households as well.

Chapter 6. A.C.E. Revision II Estimation

The A.C.E. Revision II Dual System Estimate (DSE) methodology was developed with the following objectives in mind:
* Integration of the corrections for measurement errors so that measurement errors identified by both the evaluations and the duplicate study are not over-corrected.
* Separate estimation for both E- and P-sample persons based on whether or not they linked to a census enumeration outside the search area.
* Flexibility in the post-stratification design, because the factors that affect correct enumeration (as measured by the E sample) were not necessarily the same as those that affect coverage (as measured by the P sample).
* Adjustment for correlation bias.

This chapter presents how this additional information was incorporated into the DSE for A.C.E. Revision II estimates.
The reader is assumed to be familiar with the basic dual system model and how it was used to produce the March 2001 A.C.E. estimates. See Haines (2001) for a detailed description of this methodology and Davis (2001) for the original dual system estimation results.

This chapter describes the approach to A.C.E. Revision II dual system estimation. The chapter discusses estimation of the term accounting for persons in the census who are not in the E sample. The correct enumeration rate from the E-sample data is described. Then, the estimation of the match rate from the P-sample data is addressed. The census, E-sample, and P-sample data are combined to form a single DSE formula. Next, the post-stratification variables used for the A.C.E. Revision II Full and Revision Samples are defined. The chapter then discusses adjustment for correlation bias using demographic analysis sex ratios and concludes with a discussion of synthetic estimation.

DUAL SYSTEM ESTIMATION

The basic form of the dual system estimate (DSE) is:
$DSE = (Cen' - II') \times \frac{CE}{E} \times \frac{P}{M}$ (1)

where
Cen' = the census count, excludes late adds
II' = non-data-defined census records, excludes late adds
LA = late additions to the census, i.e. records included too late for A.C.E. processing; primarily reinstated cases from the housing unit duplication operation
CE = E-sample weighted estimate of census correct enumerations
E = E-sample weighted estimate of census total enumerations (includes insufficient information for matching and followup cases, excludes non-data-defined cases and late adds)
P = P-sample weighted estimate of total persons
M = P-sample weighted estimate of matches to census persons

DSEs were computed separately within post-strata. A post-stratum is a group of people defined by demographic and geographic characteristics who are assumed to have the same probabilities of inclusion in the census. Post-strata can also be defined to have equal probabilities of correct enumeration in the census.

The DSE in (1) can be written as a function of the final census count, Cen, which includes late adds and the following three rates:
$DSE = Cen \times r_{DD} \times r_{CE} / r_{M}$ (2)

where
$r_{DD} = (Cen' - II') / Cen$ is the census data-defined rate. The numerator excludes late adds, but the denominator includes late adds.
$r_{CE} = CE / E$ is the E-sample correct enumeration rate.
$r_{M} = M / P$ is the P-sample match rate.

The three rates can be interpreted as estimates of probabilities. Thus, within post-stratum,
* $r_{DD}$ estimates the probability that a census person record has sufficient (and timely) information for inclusion in A.C.E. processing,
* $r_{CE}$ estimates the probability that an E-sample universe person is a correct enumeration, and
* $r_{M}$ estimates the probability that a person in the P sample is included in the census.

The interpretation of $r_{M}$ may be less obvious than the other two; it is the sample-weighted proportion of P-sample persons who were also found in the census. The general independence assumption underlying DSE is that either the census or the A.C.E. inclusion probabilities are the same (both are not required). Assuming causal independence, the match rate $r_{M}$ estimates the probability of census inclusion for the post-stratum.

Equation (2) also gives an interpretation of how the DSE constructs population estimates within a post-stratum.
* Multiply the census count (Cen) by the data-defined rate, $r_{DD}$, to estimate the number of census persons who are data-defined and, therefore, eligible for inclusion in the E sample.
* Reduce this product by multiplying it by the estimated probability of correct enumeration, $r_{CE}$.
* Increase this result by dividing it by the estimated probability of census inclusion, $r_{M}$.

The primary tasks in developing DSEs at the post-stratum level are the estimation of the three rates involved. The estimate $r_{DD}$ is straightforward because it is based on 100-percent census tabulations. More detail is provided for the estimates $r_{CE}$ and $r_{M}$, since they are more challenging.

The different estimation tasks can be tackled one term at a time. Basically, the goal is to estimate the numerators and denominators of the terms $r_{CE}$ and $r_{M}$. Since E, the estimated number of total census data-defined enumerations, is a simple, direct sample-weighted estimate, the challenges relate mostly to developing the estimates CE, P, and M. The estimation challenges for A.C.E. Revision II focus on accounting for: (i) information from the revised coding of the A.C.E. Revision Sample, (ii) information from the A.C.E. Revision II study of census duplicates, and (iii) different post-stratification schemes for the Full E and P samples. The most difficult issue is (ii).

Before proceeding to a detailed discussion of the A.C.E. Revision II DSE components, consider the general nature of the estimator. While the basic DSE shown in equation (1)
was applied in the 1990 PES (Hogan, 1993), the March 2001 A.C.E. incorporated the modification called PES-C estimation. See Haines (2001) and Mule (2001b) for details. This DSE had the general form:

$DSE^{C} = (Cen' - II') \times \frac{CE}{E} \times (P_{nm} + P_{im}) / \left( M_{nm} + (M_{om}/P_{om}) P_{im} \right)$ (3)

where the following quantities are all P-sample weighted estimates for the given post-stratum:
$M_{nm}$ = estimate of matches to census persons for nonmovers
$M_{om}$ = estimate of matches to census persons for outmovers
$P_{nm}$ = estimate of total nonmovers
$P_{om}$ = estimate of total outmovers
$P_{im}$ = estimate of total inmovers

Nonmovers, outmovers, and inmovers were defined with reference to their status in the period of time between Census Day (April 1, 2000) and the A.C.E. interview. Nonmovers were those who did not move during this period, outmovers were those persons who moved out of a sample block during this period, and inmovers were those who moved into a sample block during this period. Equation (3) estimated P-sample matches (M) as the sum of estimated matches among nonmovers ($M_{nm}$) and estimated matches among movers. The number of mover matches was estimated as the product of an estimated number of movers ($P_{im}$) and an estimate of the mover match rate ($M_{om}/P_{om}$). Thus, P-sample outmovers were used to estimate the mover match rate while P-sample inmovers were used to estimate the number of movers.
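The basic DSE of equation (1) and the PES-C form of equation (3) can be compared with a small numerical sketch. All input values below are invented for illustration, and the function and argument names are hypothetical; the point is only that equation (3) is equation (1) with P and M replaced by their mover-adjusted estimates.

```python
def dse_basic(cen_prime, ii_prime, ce, e, p, m):
    """Equation (1): (Cen' - II') x (CE / E) x (P / M)."""
    return (cen_prime - ii_prime) * (ce / e) * (p / m)


def dse_pes_c(cen_prime, ii_prime, ce, e, p_nm, p_im, m_nm, m_om, p_om):
    """Equation (3): PES-C estimation using inmovers and outmovers."""
    # Mover matches = number of inmovers x outmover match rate.
    matches = m_nm + (m_om / p_om) * p_im
    # Total P-sample persons = nonmovers + inmovers.
    persons = p_nm + p_im
    return (cen_prime - ii_prime) * (ce / e) * (persons / matches)


# Hypothetical weighted estimates for one post-stratum.
inputs = dict(cen_prime=1000.0, ii_prime=20.0, ce=900.0, e=950.0,
              p_nm=800.0, p_im=100.0, m_nm=720.0, m_om=80.0, p_om=100.0)

estimate_c = dse_pes_c(**inputs)
# Same result from equation (1) once P = 900 and M = 800 are plugged in.
estimate_1 = dse_basic(1000.0, 20.0, 900.0, 950.0, p=900.0, m=800.0)
print(round(estimate_c, 2), round(estimate_1, 2))
```

In this example the outmover match rate is 0.8, so the 100 inmovers contribute 80 estimated matches on top of the 720 nonmover matches.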
This approach implies that $P_{nm} + P_{im}$ should be used for the estimated total of P-sample persons (P).

Equation (3) can be further expanded to include post-stratification subscripts. The Full E- and P-sample post-strata are denoted by subscripts i and j, respectively. The census term was calculated for the cross-classification of i and j post-strata, denoted ij. The DSE formula, using procedure C for movers, with different post-strata for the E and P samples is:

$DSE^{C}_{ij} = Cen_{ij} \times r_{DD,ij} \times \left[ CE_i / E_i \right] / \left[ \left( M_{nm,j} + (M_{om,j}/P_{om,j}) P_{im,j} \right) / \left( P_{nm,j} + P_{im,j} \right) \right]$

ESTIMATION OF $r_{DD}$

Recall the general form of the DSE in equation (2). This section discusses the estimation of the data-defined rate, or DD-rate. The DD-rate estimate ($r_{DD}$) is defined as $(Cen' - II') / Cen$ for a given detailed ij post-stratum, where Cen', II', and Cen are defined from 100-percent census tabulations.

At the post-stratum level, $Cen \times r_{DD}$ reduces to $Cen' - II'$. This suggests that an alternative to computing $r_{DD}$ at a post-stratum level is to compute $Cen' - II'$ for all levels (e.g., demographic post-stratum groups within small geographic areas) for which estimates were to be computed, and then to adjust these quantities by the appropriate $r_{CE}/r_{M}$ factors. This approach may be problematic, especially when applied to very small areas.

The problem with direct computation of $Cen' - II'$ for very small areas can be seen with the following hypothetical example. Suppose a particular small geographic area (e.g., a collection of blocks) has a high rate of imputation in the census, say 15.0 percent. Imputation rates will vary geographically, and high rates could result from a number of factors, such as difficulties getting access to housing units in secure communities or difficulties in hiring and retaining census enumerators in a particular area. In this hypothetical example, removing all imputations from the census count for the area by computing $Cen' - II'$ would reduce the census count by 15.0 percent. Subsequent multiplication by the $r_{CE}/r_{M}$ factors and summing the resulting DSEs over post-strata may increase the population estimate from this base, but perhaps by no more than two or three percent (depending on the post-stratum composition of the area). The net synthetic DSE would, thus, be 12.0 or 13.0 percent lower than the census count. While this estimate could make sense if almost all the housing units for which persons were imputed were actually vacant (and this fact were not discovered in the census enumeration), it would not make sense if most of the units were occupied and the high rate of imputation resulted from other factors such as those suggested above. Calculating $r_{DD}$ for post-strata and applying it synthetically avoids such problems in small area estimates, though perhaps incurring some error for larger areas for which the direct tabulation of $Cen' - II'$ would be sensible.

The data-defined rates, $r_{DD}$, are computed at the detailed post-stratum obtained as the intersection of the E- and P-sample post-strata.

ESTIMATION OF $r_{CE}$

This section discusses the estimation of the correct enumeration rate, $r_{CE} = CE/E$. The Full E-sample post-strata are denoted by the subscript i. The Revision E sample has post-strata denoted by i', where i' is based on collapsed post-strata i. This means that the Revision Sample post-strata were obtained by collapsing the Full Sample post-strata i. The correct enumeration rate is written:
$r_{CE,i} = \left( CE_i^{ND} f_{1,i'} + \widetilde{CE}_i^{D} \right) / E_i$ (4)

Note that the numerator term separates the E-sample enumerations with a duplicate link to a census enumeration outside the A.C.E. search area, as identified in the duplicate study, from those enumerations without a link. As discussed in Chapter 5, the duplicate study used computer-based record linkage techniques to match the Full P and E samples to census enumerations outside the search area. The census enumerations included those enumerations that were added too late to be included in the E sample, as well as those enumerations that were determined to be duplicates and, therefore, were never included in the census.

The term $CE_i^{ND}$ estimates the number of correct enumerations in the Full E sample without duplicate links in post-stratum i. This term includes the probability of not being a duplicate, $1 - p_t$.

The component $\widetilde{CE}_i^{D}$ represents the estimated number of correct enumerations in the Full E sample with duplicate links in post-stratum i, which are retained after unduplication. This term includes the probability of being a duplicate, $p_t$, as well as the conditional probability that an E-sample case is a correct enumeration given that it is a duplicate to another census enumeration outside the A.C.E. search area. The total weighted number of persons in post-stratum i in the E sample is denoted by $E_i$.

The double-sampling ratio factor $f_{1,i'}$ corrects for measurement error based on the Revision E sample. It is a ratio of an estimate that uses the revised coding (indicated by *) to an estimate that uses the original coding. These adjustments, which are calculated for measurement error post-strata i', are represented by:

$f_{1,i'} = \left( CE_{i'}^{ND*} / E_{i'}^{ND} \right) / \left( CE_{i'}^{ND} / E_{i'}^{ND} \right) = CE_{i'}^{ND*} / CE_{i'}^{ND}$.

P- and E-sample cases with duplicate links were assigned a nonzero probability of being a duplicate, $p_t$. P- and E-sample cases without duplicate links were assigned a $p_t$ value of zero. This probability is usually 0 or 1 for E- and P-sample cases, but some duplicate links have a value in between, indicating less confidence that the link is representing the same person. These probabilities are also transferred to the E- and P-Revision Samples.

Although the duplicate study identified E- and P-sample cases linking to census enumerations outside the A.C.E. search area, this study could not determine which component of the link was the correct one since no additional data were collected for this purpose. Assuming that the linked person does exist, the goal is to determine which of the two locations is the appropriate place to count the person. Since linked persons may be geographically close or far apart, this has implications for the degree of synthetic error. On the E-sample side, this study does not identify whether the linked E-sample case is the correct enumeration. Thus, it is necessary to estimate the following conditional probability:
z ttheprobabilitythatanE-samplecaseisacorrectenumerationgiventhatitisaduplicatetoanothercensusenumerationoutsidetheA.C.E.searcharea.E-SampleLinksFromtheduplicatestudy,anestimateofcorrectcensusenumerationscanbederivedbyconsideringthesituationofthelinkedenumerations,aswellasassumingthateach linkrepresentsonecorrectenumeration.Thisassumes,ofcourse,thatthelinkconsistsoftrueduplicates.TheseSectionIIChapter66-3A.C.E.RevisionIIEstimationU.S.CensusBureau,Census2000 assumptionsareusedtoestimatethecontributiontocor-rectenumerationsfromFullE-samplecaseswithduplicate links,includingthoseoriginallycodedascorrect,aswell asthoseoriginallycodedaserroneous.Thiscontributiontocorrectenumerationsisgivenbytheterm: | |||
CE~i D.Toesti-matethisterm,theE-samplelinksarefirstclassifiedaccordingtothecharacteristicofthelinkedsituationand theoriginalcodingoftheEsample.Attachment1summa-rizesthisclassificationandtherulesforassigning z ts.First,linkedsituationsareidentifiedwhereonecompo-nentofthelinkisthoughttobecorrectandtheotherincorrect.Ifapersoninahousingunitlinkswithapersoninagroupquarters,suchasacollegedormitory,theper-soninthehousingunitistakentobeincorrectandassigneda z tofzero.SeeLinkedSituation1.inAttachment1.Ifalinkedperson18yearsofageorolderislistedin onlyoneofthehouseholdsasachildofthereferenceper-son,thispersonisassumedtobeincorrectlyincludedwiththeirparentsandcorrectlyincludedintheother household,unlessA.C.E.hadalreadydeterminedthemtobeanerroneousinclusion.Anexampleofthismightbeacollegestudentthatwaslistedwiththeirparentsandalso listedinanoff-campusapartment.Thisisrepresentedby LinkedSituations2a.and2b.inAttachment1.ForotherLinkedSituations,thechoiceofwhichpersoniscorrectisnotclear.Considerlinksbetweenwholehouse-holdswhereallhouseholdmembersareduplicated(LinkedSituation3.).Thisincludesfamiliesthatmight havemovedsometimearoundCensusDayandwereinad-vertentlyincludedatbothplacesorthismightinvolvehouseholdswithmultipleresidenceswithahelpful,but perhaps,uninformedproxyrespondent.Anothersituation,LinkedSituation4.,involveschildrenages0to17,per-hapsofdivorcedparents,thatarelinkedbetweentwodif-ferenthouseholds.Fortheseandallothersituations,itis assumedthatonlyhalfofthesecensusenumerationswithduplicatelinksarecorrect.Toestimatetheconditionalprobability, z t,thattheE-samplepersonisthecorrectenu-meration,controlscellsaredefinedforLinkedSituations3.,4.,and5.,asindicatedinAttachment1,by:*3Race/HispanicOriginDomains*Tenure TheseresultingcontrolcellsaregiveninAttachment2.Withineachcontrolcellthe z tsaredeterminedsuchthatduplicateE-samplecases,originallycodedcorrectorunre-solved,willweightuptoonehalfthenumberofcensus duplicatesidentified,includingtheerroneousenumera-tions.Thisiscalculatedas: | 
$$z_t = 0.5 \times \frac{\sum_{t} W_t\, p_t}{\sum_{t} W_t\, p_t\, PR_{ce,t}}$$

The summations are over the links in a control cell regardless of the original E-sample coding. The components of equation (4) are defined below.
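The control-cell calculation of z can be sketched in a few lines of Python. This is a minimal illustration, not the Census Bureau's production code: the function name `control_cell_z`, the tuple layout, and the example weights are all hypothetical. The source does not say how a value above 1 would be handled, so the sketch does not cap it.

```python
def control_cell_z(links):
    """Return the conditional probability z for one control cell.

    links: list of (weight, p_dup, pr_ce) tuples, one per E-sample
    duplicate link in the cell, regardless of the original coding.
    (Hypothetical layout chosen for this sketch.)
    """
    # Half of the weighted number of census duplicates identified.
    half_of_duplicates = 0.5 * sum(w * p for w, p, _ in links)
    # Weighted duplicates originally coded correct or unresolved.
    coded_correct = sum(w * p * pr_ce for w, p, pr_ce in links)
    if coded_correct == 0:
        raise ValueError("no weighted correct or unresolved cases in this cell")
    return half_of_duplicates / coded_correct


# Made-up cell: three certain links (one coded erroneous, pr_ce = 0)
# and one probability match with p = 0.5.
example_links = [
    (100.0, 1.0, 1.0),
    (100.0, 1.0, 1.0),
    (100.0, 1.0, 0.0),
    (50.0, 0.5, 1.0),
]
z = control_cell_z(example_links)
# Weighted correct enumerations retained after applying z.
retained = sum(w * p * z * pr_ce for w, p, pr_ce in example_links)
```

With z defined this way, the cases coded correct or unresolved weight up to half of the weighted duplicates in the cell, which is the property the text requires.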
CE~i Dti WE , t p t z t PR ce, tistheestimatednumberofcorrectenumerationswithduplicatelinksinpost-stratum iwhowereretainedafter unduplication. | |||
CE i NDti W, t E1p tPR ce, tisthenumberofcorrectenumerationswithoutduplicatelinksinpost-stratum i,wherethesummationistakenoverallenumerationsintheA.C.E.Esampleinpost-stratum i.W, t EistheproductionA.C.E.samplingweightforE-sampleperson t.p tistheprobabilitythatperson thasaduplicatelinkoutsidethesearcharea.Thisisusually0or1,butcouldbebetweenthesetwovaluesfor probabilitymatches,wheretheaccuracyofthe linkwasuncertain. | |||
$PR_{ce,t}$ is the probability that person t is a correct enumeration in the original production coding. This is either 0 or 1 unless it was not possible to code the E-sample case as a correct or erroneous enumeration. In these cases, a probability of correct enumeration was imputed.
$$f_{1,i'} = \frac{CE_{i'}^{ND*}}{CE_{i'}^{ND}} = \frac{\sum_{t \in i'} W^{E}_{RR,t}\,(1 - p_t)\,PR_{ceR,t}}{\sum_{t \in i'} W^{E}_{R,t}\,(1 - p_t)\,PR_{ce,t}}$$

where

$W^{E}_{RR,t}$ is the A.C.E. Revision Sample weight for person t to be used for Revision Sample coding.

$W^{E}_{R,t}$ is the A.C.E. Revision Sample weight for person t to be used with production coding. These two weights could differ slightly depending on TES status and noninterview adjustment.
PR ce R, tistheprobabilitythatperson tisacorrectenumerationintheA.C.E.RevisionSample coding.E iti W, t EisthetotalweightednumberofpersonsintheEsamplein post-stratum i.ESTIMATIONOF r MThissectiondiscussestheestimatedmatchrateinequa-tion(2).E-samplepost-strataareindexedby i,whiletheP-samplepost-strataareindexedby j.Thematchratefor post-stratum jisrepresentedas: | |||
$$r_{M,j} = \frac{M_{nm,j}^{ND}\, f_{2,j'} + \widetilde{M}_{nm,j}^{D} + \left[\dfrac{M_{om,j}\, f_{3,j'}}{P_{om,j}\, f_{4,j'}}\right]\left(P_{im,j}\, f_{5,j'} + g\,(P_{nm,j}^{D} - \widetilde{P}_{nm,j}^{D})\right)}{P_{nm,j}^{ND}\, f_{6,j'} + \widetilde{P}_{nm,j}^{D} + P_{im,j}\, f_{5,j'} + g\,(P_{nm,j}^{D} - \widetilde{P}_{nm,j}^{D})} \qquad (5)$$

The residence status of P-sample movers was adjusted for coding error. The computer matching results were not used. Outmovers in the P sample were collected by a proxy interview, which made it difficult to obtain date of birth and age information. Since date of birth and age were important characteristics used in the computer matching, the movers were only adjusted for coding error.

Although the duplicate study identified E- and P-sample cases linking to census enumerations outside the A.C.E. search area, this study could not determine which component of the link was the correct one, since there were no additional data collected to determine this. Assuming that the linked person does exist, the goal is to determine which of the two locations is the appropriate place to count the person. Since linked persons may be geographically close or far apart, this has implications for the degree of synthetic error.

On the P-sample side, this study does not identify whether the linked P-sample case is a resident on Census Day. Thus, it is necessary to estimate the following conditional probability:
h tistheprobabilitythataP-samplecaseisaresidentonCensusDaygiventhatitlinkstoacensusenumera-tionoutsidetheA.C.E.searcharea.P-SampleLinksUnliketheE-sampleside,theduplicatestudydoesNOTprovideanestimateofthenumberofcorrectCensusDayresidentsinthePsample.Inordertoestimate h ttheprob-abilitythataP-samplecaseisaresidentonCensusDaygiventhatitlinkstoacensusenumerationoutsidethe searcharea,itisnecessarytoborrowtheresulting z tsfromtheE-samplelinks.Attachment1summarizeshow | |||
the h tsborrowinformationfromthe z ts.First,theP-samplelinkstocensusenumerationsoutsidethesearchareaareidentifiedforsituationswhereitcanbedeterminedwhichcomponentofthelinkisthecorrect residence.TheLinkedSituationsandrulesforassigning h tsarethesameasthoseusedforcomparabletypesofE-samplelinks.Forexample,consideraP-sampleperson18yearsofageorolder,listedasachildofthereferencepersonwholinkswithacensusenumerationinahouse-holdwheretheyarenotlistedasachild.ThisP-samplepersonwouldbeassignedan h tofzeroregardlessofhowA.C.E.codedthisperson.Thus,itisassumedthatthisper-sonshouldnothavebeenincludedinthePsample.FortheotherLinkedSituations3.,4.,and5.,thereonceagainisnoinformationtodeterminewhetherthePsample hadthepersonatthecorrectlocationorwhetherthecen-sushadthematthecorrectlocation.Additionally,thereisnoreasonableassumptionabouthowmanyofthese linkedP-samplepersonsshouldbeatthecorrectlocation.Toovercomethisobstacle,itisassumedthattheerrorinidentifyingcorrectresidenceissimilartotheerroriniden-tifyingcorrectenumerationforsimilarsituations.There-fore,the h tforP-samplepersonsissetequaltothe z tdeter-minedfortheEsampleforcomparablelinkedsituations asidentifiedbythecontrolcellsinAttachment2.The h tsarethenincludedintheweightedtallies,alongwiththe p t ,tocalculatetheduplicatecontributiontotheFullP-sample nonmoversandnonmovermatches.Thetermsinequation(5)aredefinedbelow.Summation tjdenotessummationoverA.C.E.FullP-Samplepost-stratumj,whilesummation tj'denotessummationoverRevisionSamplepost-stratum j'.Thesummationnotationalsoindicateswhetherthesumistakenovernonmovers,outmovers,orinmovers,andiftheProduction | |||
()orRevi-sion(R)Samplecodingisused. | |||
M nm , j NDW, t P (1p t)PR res, t PR m, t tjtnonmover productionwhere W, t PistheP-sampleproductionweightofperson t.p tistheprobabilitythatperson thasaduplicatelinkoutsidethesearcharea. | |||
$PR_{m,t}$ is the probability that person t is a match in the production coding.

$PR_{res,t}$ is the probability that person t is a resident in the production coding.

$$f_{2,j'} = \frac{M_{nm,j'}^{ND*}}{M_{nm,j'}^{ND}} = \frac{\sum_{t \in j',\; t \in \text{nonmover revision}} W^{P}_{RR,t}\,(1 - p_t)\,PR_{resR,t}\,PR_{mR,t}}{\sum_{t \in j',\; t \in \text{nonmover production}} W^{P}_{R,t}\,(1 - p_t)\,PR_{res,t}\,PR_{m,t}}$$

is the double-sampling adjustment for nonmover matches.

$PR_{mR,t}$ is the probability that person t is a match in the Revision Sample coding.

$PR_{resR,t}$ is the probability that person t is a resident in the Revision Sample coding.

$W^{P}_{RR,t}$ is the A.C.E. Revision Sample weight for person t to be used for Revision Sample coding.

$W^{P}_{R,t}$ is the A.C.E. Revision Sample weight for person t to be used with production coding. These two weights could differ slightly depending on TES status and the noninterview adjustment.
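The double-sampling adjustment for nonmover matches is a ratio of two weighted sums taken over the same Revision Sample post-stratum, one under revision coding and one under production coding. The Python sketch below illustrates that structure only. The dictionary keys, the function name, and the three example cases are hypothetical and are not taken from the A.C.E. files.

```python
def double_sampling_ratio(cases):
    """Ratio of the revision-coded to the production-coded weighted estimate
    of nonmover matches without duplicate links, for one post-stratum j'.

    Each case is a dict with hypothetical keys:
      w_rr, w_r           Revision Sample weights (revision / production coding)
      p                   probability of a duplicate link outside the search area
      pr_res, pr_m        production-coding resident and match probabilities
      pr_res_r, pr_m_r    revision-coding resident and match probabilities
      nonmover_prod, nonmover_rev   mover status under each coding
    """
    revised = sum(
        c["w_rr"] * (1 - c["p"]) * c["pr_res_r"] * c["pr_m_r"]
        for c in cases if c["nonmover_rev"]
    )
    original = sum(
        c["w_r"] * (1 - c["p"]) * c["pr_res"] * c["pr_m"]
        for c in cases if c["nonmover_prod"]
    )
    return revised / original


base = dict(w_rr=10.0, w_r=10.0, p=0.0, pr_res=1.0, pr_m=1.0,
            pr_res_r=1.0, pr_m_r=1.0, nonmover_prod=True, nonmover_rev=True)
example_cases = [
    base,                               # unchanged by the recoding
    dict(base, pr_res_r=0.0),           # recoded as a nonresident
    dict(base, w_rr=20.0, w_r=20.0, p=1.0),  # certain duplicate, drops out of both sums
]
f2 = double_sampling_ratio(example_cases)
```

In this made-up post-stratum the recoding removes half of the weighted nonmover matches, so the ratio is one half. The other double-sampling factors have the same ratio form with different terms inside the sums.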
M om , jW, t P PR res, t PR m, t tjtoutmover productionisthenumberofmatchedoutmoversintheFullSamplein post-stratum j.SectionIIChapter66-5A.C.E.RevisionIIEstimationU.S.CensusBureau,Census2000 f 3, j'M om , j'*M om , j'W RR, t P PR res R, t PR m R, t tj'toutmover revisionW R, t P PR res, t PR m, t tj'toutmover productionisthedouble-samplingratioformatchedoutmoversfor post-stratum j'.P om , jW, t P PR res, t t, jtnonmover productionisthenumberofoutmoversintheFullSampleforpost-stratum j.f 4, j'P om , j'*P om , j'W RR, t P PR res R, t tj'toutmover revisionW R, t P PR res, t tj'toutmover productionisthedouble-samplingratioforoutmoversforpost-stratum j'.P im , jW, t P tjtnonmover productionisthenumberofinmoversintheFullSamplepost-stratum j.f 5, j'P im , j'*P im , j'W RR, t P PR inmover R, t tj'tinmover revisionW R, t P tj'tinmover productionisthedouble-samplingratioforinmoversforpost-stratum j'.PR inmover R, tistheprobabilitythatperson tintheRevi-sionSampleisaninmover. | |||
$g\,(P_{nm,j}^{D} - \widetilde{P}_{nm,j}^{D})$

The term g adjusts the number of inmovers for those Full P-sample nonmovers who are determined to be nonresidents because of duplicate links. Some of these nonresidents are nonresidents because they are inmovers and should be added to the count of inmovers. The term $P_{nm,j}^{D} - \widetilde{P}_{nm,j}^{D}$ is an estimate of nonresidents among nonmovers with duplicate links. This term is multiplied by g, which is an estimate of the proportion of originally-coded nonmovers with duplicate links who are true nonresidents that have moved in since Census Day. The term g is estimated using the Revision Sample and both the original A.C.E. and the revision coding as follows:

$$g = \frac{P_{nm,im}^{*D}}{P_{nm,nr}^{*D}}$$

$P_{nm,im}^{*D}$ is an estimate of persons (using the Revision P sample) with a duplicate link who were originally coded as nonmovers but the revision coding determined them to be inmovers (a subset of nonresidents).

$P_{nm,nr}^{*D}$ is an estimate of persons (using the Revision P sample) with a duplicate link who were originally coded as nonmovers but the revision coding determined them to be nonresidents.

A couple of important assumptions are:

* If the revision coding determined that a person was a nonresident, they really are a nonresident. That is, revision-coded nonresidents are assumed to be a subset of true nonresidents.
* The rate of inmovers for revision-coded nonresidents is the same as that for true nonresidents.
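The role of g can be illustrated with a short Python sketch. It assumes a simple weighted tally of Revision P-sample cases. The source does not spell out the exact weighting used in the two estimates, so the tuple layout, the revision-code labels, and the numbers below are hypothetical.

```python
def inmover_share(cases):
    """Estimate g from Revision P-sample cases that have a duplicate link
    and were originally coded as nonmovers.

    cases: list of (weight, revision_code) tuples, where revision_code is
    one of "resident", "inmover", or "other_nonresident" (labels invented
    for this sketch; inmovers count as nonresidents).
    """
    nonresidents = sum(w for w, code in cases
                       if code in ("inmover", "other_nonresident"))
    inmovers = sum(w for w, code in cases if code == "inmover")
    return inmovers / nonresidents


def adjusted_inmovers(p_im, f5, g, p_dup, p_dup_resident):
    """Inmover count after measurement error correction, plus the share of
    duplicate-linked nonmovers judged nonresident who are taken to be inmovers."""
    return p_im * f5 + g * (p_dup - p_dup_resident)


example_cases = [(30.0, "inmover"), (70.0, "other_nonresident"), (200.0, "resident")]
g = inmover_share(example_cases)
total_inmovers = adjusted_inmovers(p_im=1000.0, f5=0.95, g=g,
                                   p_dup=400.0, p_dup_resident=300.0)
```

Here 30 of the 100 weighted revision-coded nonresidents are inmovers, so about 30 percent of the 100 duplicate-linked nonmovers judged nonresident are added to the corrected inmover count.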
M~nm , j DW, t P p t h t PR m, t PR res, t tjtinmover productionisthenumberofduplicatepersonsdeterminedtohavebeenCensusDayresidentswhomatchedtothe censusinpost stratum j.P nm , j NDW, t P1p tPR res, t tjtinmover productionisthenumberofnonmoverswithoutlinksoutsidethesearchareainpost-stratum j.f 6, j'P nm , j'ND*P nm , j'NDW RR, t P1p tPR res R, t tj'tnonmover revisionW R, t P1p tPR res, t tj'tnonmover productionisthedouble-samplingadjustmentfornonmoversinpost-stratumj'.6-6SectionIIChapter6A.C.E.RevisionIIEstimationU.S.CensusBureau,Census2000 P~nm , j DW, t P p t h t PR res, t tjtnonmover productionistheestimatednumberofnonmoverpersonswithdupli-catelinkswhowereresidentsafterunduplication. | |||
P nm , j DW, t P P t PR res, t tjtnonmover productionisthenumberofP-samplepersonswithduplicatelinks,regardlessofwhethertheyweredeterminedtoberesi-dentsbytheunduplicationprocess.THEA.C.E.REVISIONIIDSEFORMULATheA.C.E.RevisionIIDSEformula,usingprocedureCformovers,separateE-andP-samplepost-strata,measure-menterrorcorrectionsfromtheE-andP-RevisionSamples,andduplicatestudyresultsis: | |||
$$DSE_{ij}^{C} = Cen_{ij} \times r_{DD,ij} \times \frac{\left[\dfrac{CE_i^{ND}\, f_{1,i'} + \widetilde{CE}_i^{D}}{E_i}\right]}{\left[\dfrac{M_{nm,j}^{ND}\, f_{2,j'} + \widetilde{M}_{nm,j}^{D} + \left[\dfrac{M_{om,j}\, f_{3,j'}}{P_{om,j}\, f_{4,j'}}\right]\left(P_{im,j}\, f_{5,j'} + g\,(P_{nm,j}^{D} - \widetilde{P}_{nm,j}^{D})\right)}{P_{nm,j}^{ND}\, f_{6,j'} + \widetilde{P}_{nm,j}^{D} + P_{im,j}\, f_{5,j'} + g\,(P_{nm,j}^{D} - \widetilde{P}_{nm,j}^{D})}\right]}$$

Notation

Terms
CE    Correct enumerations
E     E-sample total
M     Matches
P     P-sample total
f     Adjusts for measurement error
g     Adjusts nonmovers to movers due to duplication

Subscripts
i, j          Full E and P post-strata
i', j'        Revision E and P measurement error correction post-strata
nm, om, im    nonmover, outmover, inmover

Superscripts
C     DSE procedure C for movers
ND    Not a duplicate to census enumeration outside search area
D     Duplicate to census enumeration outside search area
~     Includes probability adjustment for residency given duplication

In some small post-strata, the number of inmovers was substantially larger than the number of outmovers. If there were only a few outmovers, the outmover match rate was subject to high sampling error. In these post-strata, it was not considered appropriate to apply a suspect match rate to what could be a relatively large number of inmovers, so PES-A was used. PES-A uses only outmovers. PES-A was applied for post-strata with nine or fewer P-sample outmovers. For these post-strata, it was assumed that some of the duplicate links determined not to have been residents were really outmovers.

The DSE formula that uses procedure A for movers with different post-strata for the E and P samples is:

$$DSE_{ij}^{A} = Cen_{ij} \times r_{DD,ij} \times \frac{CE_i / E_i}{\left[\dfrac{M_{nm,j} + M_{om,j}}{P_{nm,j} + P_{om,j}}\right]}$$

The A.C.E. Revision II DSE formula, using procedure A for movers, separate E- and P-sample post-strata, measurement error corrections from the E- and P-Revision Samples, and duplicate study results is written:

$$DSE_{ij}^{A} = Cen_{ij} \times r_{DD,ij} \times \frac{\left[\dfrac{CE_i^{ND}\, f_{1,i'} + \widetilde{CE}_i^{D}}{E_i}\right]}{\left[\dfrac{M_{nm,j}^{ND}\, f_{2,j'} + \widetilde{M}_{nm,j}^{D} + M_{om,j}\, f_{3,j'} + g\,(M_{nm,j}^{D} - \widetilde{M}_{nm,j}^{D})}{P_{nm,j}^{ND}\, f_{6,j'} + \widetilde{P}_{nm,j}^{D} + P_{om,j}\, f_{4,j'} + g\,(P_{nm,j}^{D} - \widetilde{P}_{nm,j}^{D})}\right]}$$

This version of the formula is used only when the sample size for outmovers in the Full P sample is strictly less than 10. This formula was used 93 times in the A.C.E. Revision II production process. The new term introduced in this formula is defined as follows:
M nm , j DW, t p p t PR res, t PR m, t tjtnonmover productionisthenumberofmatchedP-samplepersonswithduplicatelinks,regardlessofwhethertheyweredeterminedtoberesidentsbytheunduplicationprocess.A.C.E.REVISIONIIPOST-STRATIFICATIONDESIGNTheFullE-andP-sampleswiththeoriginalcodingresultsthatwereusedtoproducetheMarch2001estimatesofcensuscoverageprovidedthebasisoftheA.C.E. | |||
RevisionIIestimates.TheMarch2001A.C.E.estimatesweredeterminedtobeunacceptablebecauseofthepres-enceoflargeamountsofmeasurementerror.TheseFull sampleswerecomprisedofover700,000samplepersons each.Insteadofonesetofpost-stratificationvariables,theA.C.E.RevisionIIestimatesincludeseparatepost-stratafortheFullEandPsamples,indicatedbysubscripts i and j ,respectively.FullPSampleFortheFullPsample,thenewpost-stratawerenearlyidenticaltothoseusedfortheMarch2001A.C.E.esti-mates.Theonlydifferencewasthatthe0-17agegroupSectionIIChapter66-7A.C.E.RevisionIIEstimationU.S.CensusBureau,Census2000 wassplitintotwogroups,0-9and10-17,whichresultedinsomecollapsingdifferences.TheFullPsample,consist-ingof480post-strata,wasbasedonthefollowingcharac-teristics(asopposedtotheprevious416post-strata):*Race/HispanicOriginDomain | |||
*Tenure*SizeofMetropolitanStatisticalArea*TypeofCensusEnumerationArea*ReturnRateIndicator(Lowvs.High)*Region | |||
*Age*Sex FortheFullPsample,thepost-stratumgroupseitherretainedalleightAge/SexcategoriesorwerecollapsedtofourAge/Sexcategoriesasshownbelow:Figure6-1.P-SampleAge/SexGroupings Age8groups4groups1group*MaleFemaleMaleFemaleMaleFemale 0-9 10-17 18-29 30-49 50+*The1groupisnotusedfortheFullP-samplepost-strata(j),onlytheRevisionP-samplepost-strata(j').Table6-1showsthe64FullP-samplepost-stratumgroups.Thenumberineachcellrepresentsthenumberof Age/Sexcategoriesineachpost-stratumgroup.6-8SectionIIChapter6A.C.E.RevisionIIEstimationU.S.CensusBureau,Census2000 Table6-1.FullP-SamplePost-StratumGroupsandNumberofAgeandSexGroupings(j)Race/HispanicorigindomainnumberTenureMSA/TEAHighreturnrateLowreturnrateNEMWSWNEMWSWDomain7(Non-HispanicWhiteor Someotherrace) | |||
OwnerLargeMSAMO/MB88888484MediumMSAMO/MB88884888SmallMSA&Non-MSAMO/MB88884888AllotherTEAs88888888 NonownerLargeMSAMO/MB88MediumMSAMO/MB88 SmallMSA&Non-MSAMO/MB88 AllotherTEAs88Domain4(Non-HispanicBlack) | |||
OwnerLargeMSAMO/MB 88MediumMSAMO/MBSmallMSA&Non-MSAMO/MB 88AllotherTEAs NonownerLargeMSAMO/MB 88MediumMSAMO/MBSmallMSA&Non-MSAMO/MB 84AllotherTEAsDomain3 (Hispanic) | |||
OwnerLargeMSAMO/MB 88MediumMSAMO/MBSmallMSA&Non-MSAMO/MB 88AllotherTEAs NonownerLargeMSAMO/MB 88MediumMSAMO/MBSmallMSA&Non-MSAMO/MB 84AllotherTEAsDomain5(NativeHawaiianorPacificIslander) | |||
Owner 4 Nonowner 4Domain6(Non-HispanicAsian) | |||
Owner 8 Nonowner 8 AmericanIndianor Alaska NativeDomain1 (On Reservation) | |||
Owner 8 Nonowner 8Domain2(Off Reservation) | |||
Owner 8 Nonowner 8SectionIIChapter66-9A.C.E.RevisionIIEstimationU.S.CensusBureau,Census2000 FullESampleFortheA.C.E.RevisionIIFullEsample,thepost-stratadefinitionshaveundergonemajorrevisions.Someofthe originalpost-stratificationvariableswereomittedand additionalvariableswereadded.Logisticregressionmod-elsidentifiedseveralvariables,notincludedintheFull P-samplepost-stratification,thatweregoodindicatorsof correctenumeration.TheFullEsample,consistingof525 post-strata,wasdefinedusingthefollowing | |||
characteristics:

* Proxy Status
* Race/Hispanic Origin Domain
* Tenure
* Household Relationship
* Household Size
* Type of Census Return (mailback vs. nonmailback)
* Date of Return (early vs. late)
* Age
* Sex

The new variables proxy status, household relationship and size, and type (mailback/nonmailback) and date (early/late) of census return are described generally below.

* Proxy Status. Nonproxy includes those housing unit persons for whom census data were provided by a household member. Proxy includes those housing unit persons for whom census data were provided by a nonhousehold member, such as a neighbor or rental agent.
* Household Relationship. The Householder/Nuclear (HHer/Nuclear) relationship category includes persons in housing units consisting only of the householder with spouse or own children (17 or younger). The Other relationship category consists of single-person households and persons in housing units with any other type of relationship, including unrelated persons.
* Household Size. Household size, or number of persons residing in the housing unit.
* Early/Late Mailback. Persons in mailback housing units with an earliest form processing date. On or before March 24 is early and after March 24 is late.
*Early/LateNonmailback.Personsinnonmailbackhousingunitswithanearliestformprocessingdate.OnorbeforeJune1isearlyandafterJune1islate.FortheFullEsample,thepost-stratumgroupseitherretainedalleightAge/Sexcategoriesorwerecollapsedto four,two,oroneAge/Sexgroups,basedonsamplesizes,asshownbelow:Figure6-2.E-SampleAge/SexGroupings Age8groups4groups2groups1groupMaleFemaleMaleFemaleMaleFemaleMaleFemale 0-9 10-17 18-29 30-49 50+Table6-2showsthe93FullE-samplepost-stratumgroups.ThenumberineachcellrepresentsthenumberofAge/Sexcategoriesineachpost-stratumgroup.6-10SectionIIChapter6A.C.E.RevisionIIEstimationU.S.CensusBureau,Census2000 Table6-2.FullE-SamplePost-StratumGroupsandNumberofAgeandSexGroupingsProxystatus&domainTenureRelationshipHHSize Early mailback Late mailback Early non-mailbackLatenon-mailbackProxy:Domain7(Non-HispanicWhiteorSomeOtherRace)8Proxy:Domain4(Non-HispanicBlack) 8Proxy:Domain3(Hispanic) 8Proxy:Domain5(NativeHawaiianorPacificIslander) 1Proxy:Domain6(Non-HispanicAsian) 4Proxy:Domain1(AmericaIndianorAlaskaNativeOnReservation)4 Proxy:Domain2(AmericanIndianorAlaskaNativeOffReservation)1 Nonproxy:Domain7 (Non-HispanicWhiteorSomeOtherRace) | |||
Owner HHer/Nuclear2-388884+8848 Other122122-38824 4+8848 NonownerHHer/Nuclear8888Other8888 Nonproxy:Domain4 (Non-HispanicBlack) | |||
OwnerHHer/Nuclear4424Other8848 NonownerHHer/Nuclear8888Other8888 Nonproxy:Domain3 (Hispanic) | |||
OwnerHHer/Nuclear8848Other8848 NonownerHHer/Nuclear8888Other8888 Nonproxy:Domain5(NativeHawaiianorPacificIslander)Owner&NonownerHHer/Nuclear2222Other2212 Nonproxy:Domain6 (Non-HispanicAsian)Owner&NonownerHHer/Nuclear8844Other4424 Nonproxy:(AmericanIndianorAlaskaNative)Domain1OnReservationOwner&NonownerHHer/Nuclear8Other8Domain2OffReservationOwner&NonownerHHer/Nuclear2222Other2212SectionIIChapter66-11A.C.E.RevisionIIEstimationU.S.CensusBureau,Census2000 RevisionPSampleTheRevisionPsampleisasubsampleoftheFullPsampleandiscomprisedofover60,000samplepersons.The RevisionPsamplehasbeensubjectedtoanadditional fieldinterviewand/orrematchingoperationaspartofthe originalA.C.E.evaluationprogram.Insupportofthe A.C.E.RevisionIIprogram,theRevisionPsamplehas undergoneextensiverecodingusingallavailableinterview dataandmatchingresults.Missingdataadjustmentshave alsobeenappliedtotheRevisionPsample.Thisrecoded dataareusedtocorrectformeasurementerrorintheFull Psample.Themeasurementerrorcorrectionpost-stratumdefinitions (j')dependonapersonsmoverstatus.BothinmoversandoutmoversaresubdividedintoOwnerandNonowner groups.Fornonmovers,themeasurementerrorcorrection post-strataare:AmericanIndiansonReservations(AIR) and,fortheNon-AIRcases,acrossofTenure(Ownerver-susNonowner)witheightAgeandSexcategories.The Age/SexcollapsingpatternfromtheFullPsampleis retainedwhendefiningthemeasurementerrorcorrection post-strata.TheRevisionP-samplepost-strata(j')aredefinedasfollows:Figure6-3.RevisionP-SamplePost-Strata(j')MoverStatus&DomainTenureAge8groups1groupMaleFemale Movers:Domains1thru7 Owner Nonowner Nonmovers:Domains2thru7 Owner 0-9 N/A 10-17 18-29 30-49 50+Nonowner 0-9 N/A 10-17 18-29 30-49 50+Nonmovers:Domain1(AmericanIndianorAlaskaNativeOnReservation)N/Ameansnotapplicable.6-12SectionIIChapter6A.C.E.RevisionIIEstimationU.S.CensusBureau,Census2000 RevisionESampleTheRevisionEsampleisasubsampleoftheFullEsampleandiscomprisedofover75,000samplepersons.The RevisionEsamplehasbeensubjectedtoanadditional 
fieldinterviewand/orrematchingoperationaspartofthe originalA.C.E.evaluationprogram.Insupportofthe A.C.E.RevisionIIprogram,theRevisionEsamplehasundergoneextensiverecodingusingallavailableinterviewdataandmatchingresults.Missingdataadjustmentshave alsobeenappliedtotheRevisionEsample.TheserecodeddataareusedtocorrectformeasurementerrorintheFullESample.FortheRevisionEsample,themeasurementerrorcorrec-tionpost-strataare:Proxies,AmericanIndiansonReserva-tions(AIR)and,fortheNonproxy/Non-AIRcases,across ofatwo-levelRelationshipvariablewitheightAge/Sexcategories.NotethatHouseholdSizeiscollapsedoutoftheHouseholdRelationship/Sizevariable.TheAge/SexcollapsingpatternfromtheFullEsampleisretainedwhendefiningthemeasurementerrorcorrectionpost-strata.The RevisionEsamplepost-strata(i')aredefinedasfollows:Figure6-4.RevisionE-SamplePost-Strata(i')ProxyStatus&DomainRelationshipAge8groups1groupMaleFemale Proxy:Domain7(Non-HispanicWhiteorSomeOtherRace) | |||
Domain4(Non-HispanicBlack) | |||
Domain3(Hispanic) | |||
Domain5(NativeHawaiianorPacificIslander) | |||
Domain6(Non-HispanicAsian) | |||
Domain1(AmericanIndianorAlaskaNativeOnReservation) | |||
Domain2(AmericanIndianorAlaskaNativeOffReservation) | |||
Nonproxy:Domains2thru7 HHer/Nuclear 0-9 N/A 10-17 18-29 30-49 50+Other 0-9 N/A 10-17 18-29 30-49 50+Nonproxy:Domain1(AmericanIndianorAlaskaNativeOnReservation)N/Ameansnotapplicable.SectionIIChapter66-13A.C.E.RevisionIIEstimationU.S.CensusBureau,Census2000 ADJUSTMENTFORCORRELATIONBIASUSINGDEMOGRAPHICANALYSISThedualsystemestimatesareadjustedtocorrectforcor-relationbias.Correlationbiasexistswhenevertheprob-abilitythatanindividualisincludedinthecensusisnotindependentoftheprobabilitythattheindividualis includedintheA.C.E.Thisformofbiasgenerallyhasadownwardeffectonestimates,becausepeoplemissedinthecensusmaybemorelikelytoalsobemissedinthe A.C.E.Estimatesofcorrelationbiasarecalculatedusingthetwo-groupmodelandsexratiosfromDemographicAnalysis(DA).Thesexratioisdefinedasthenumberof malesdividedbythenumberoffemales.Thismodelassumesnocorrelationbiasforfemalesorformalesunder18yearsofage;nocorrelationbiasadjustmentfor non-Blackmalesaged18-29;andthatBlackmaleshavearelativecorrelationbiasthatisdifferentthantherelativecorrelationbiasfornon-Blackmales.Thecorrelationbias adjustmentisalsodonebythreeagecategories:18-29, 30-49,and50andover.Thismodelfurtherassumesthatrelativecorrelationbiasisconstantovermalepost-stratawithinagegroups.TheRace/HispanicOriginDomainvari-ableisusedtocategorizeBlackandnon Black.TheDAtotalsareadjustedtomakethemcomparablewithA.C.E.Race/HispanicOriginDomains.BlackHispanicsaresubtractedfromtheDAtotalforBlacksandaddedtothe DAtotalfornon-Blacks.ThisisdonebecausetheA.C.E.assignsBlackHispanicstotheHispanicdomain,nottheBlackdomain.Thesecondadjustmentdeletesgroupquar-terspeoplefromtheDAtotalsusingCensus2000data. | |||
The reason for making this adjustment is that the group quarters population is not part of the A.C.E. universe. A final adjustment that could be made would be to remove the Remote Alaska population from the DA totals, since it too is not part of the A.C.E. universe. Since this population is small, the DA sex ratios would not be affected in any meaningful way. The resulting DA sex ratios for the three age groups by Black and non-Black domain are shown in Attachment 3.

In general the correlation bias adjustment factor, $c_k$, is defined for each of the three age groups k such that:

$$E[\,c_k \times DSE_k^{m}\,] = \text{True male population for age group } k,$$

where $DSE_k^{m}$ is the sum of DSEs over male post-strata in age group k. Since the purpose of this adjustment is to reflect persons missed in both the census and the A.C.E., the value of $c_k$ was not allowed to be less than one.

Correlation Bias Adjustment for Black and Non-Black Males 18 Years and Older

The correlation bias adjustment for Black and non-Black males 18 years and older is done so that the A.C.E. Revision II sex ratios will agree with the DA sex ratios for Blacks and non-Blacks. This correlation bias adjustment is calculated as:

$$c_{R,k} = \left(\frac{\sum_{ij \in k} DSE_{ij}^{Rf}}{\sum_{ij \in k} DSE_{ij}^{Rm}}\right) r_{DA\,R,k}$$

where

$DSE_{ij}^{Rf}$ = DSE for race R = Black or non-Black, female post-strata ij.

$DSE_{ij}^{Rm}$ = DSE for race R = Black or non-Black, male post-strata ij.

$r_{DA\,R,k}$ = DA sex ratio for race R = Black or non-Black, for age group k as given in Attachment 3.

The sum over the ij post-strata includes only the intersection of those post-strata with age group k.

DSEs Adjusted for Correlation Bias

A correlation bias-adjusted DSE for a male, 18+ post-stratum ij in age-race group k is calculated as:

$$\widetilde{DSE}_{ij}^{m} = c_k \times DSE_{ij}^{m}$$

For all remaining post-strata, which include female post-strata as well as post-strata for persons under 18 years of age, no correlation bias adjustment is done. Thus:

$$\widetilde{DSE}_{ij}^{f} = DSE_{ij}^{f}$$

The $\widetilde{DSE}_{ij}$s are then used to form synthetic estimates.

SYNTHETIC ESTIMATION

The coverage correction factors for detailed post-strata ij are calculated as:

$$\widetilde{CCF}_{ij} = \frac{\widetilde{DSE}_{ij}}{Cen_{ij}}$$

where the $\widetilde{DSE}_{ij}$ are the correlation bias-adjusted DSEs for post-stratum ij, and the $Cen_{ij}$s are the census counts for post-stratum ij. Note that this $Cen_{ij}$ includes late census adds.

A coverage correction factor was assigned to each census person, except those in group quarters or Remote Alaska. Effectively, these persons have a coverage correction factor of 1.0. In dealing with duplicate links to group quarters persons, the person in the group quarters was treated as the correct enumeration, or that this was their correct residence on Census Day.

A synthetic estimate for any area or population subgroup b is given by:

$$\widetilde{N}_b = \sum_{ij \in b} Cen_{b,ij}\, \widetilde{CCF}_{ij}$$

Note that the coverage correction factor can be expressed as:

$$\widetilde{CCF}_{ij} = \left(\frac{DD_{ij}}{Cen_{ij}}\right)\left(\frac{r_{CE,i}}{r_{M,j}}\right) c_k$$

where

$r_{CE,i}$ is the correct enumeration rate component of the DSE, varying over i post-strata.

$r_{M,j}$ is the match rate component of the DSE, varying over j post-strata.
$c_k$ is the correlation bias adjustment factor, varying over the Black and non-Black groups and k age cells.

$DD_{ij} / Cen_{ij}$ is the data-defined rate, varying over the ij post-strata.

Attachment 1. Rules for Assigning z_t & h_t for Full P- and E-Sample Duplicate Links

The Linked Situations and assignment of z_ts and h_ts occur in the order listed below.

Linked situation: (E or P) (Census)                                                           Original E coding   z_t   Original P coding   h_t
1.  (Person in a housing unit) (Person in a group quarters)                                   EE                  0     NonRes              0
                                                                                              CE/UE               0     Res/UE              0
2a. (Person 18+, child of reference person) (Person 18+, not child of reference person)      EE                  0     NonRes              0
                                                                                              CE/UE               0     Res/UE              0
2b. (Person 18+, not child of reference person) (Person 18+, child of reference person)      EE                  0     NonRes              0
                                                                                              CE/UE               1     Res/UE              1
3.  (All persons in a housing unit) (All persons in another housing unit)                     EE                  0     NonRes              0
                                                                                              CE/UE               z1    Res/UE              z1
4.  (Child 0-17) (Child 0-17)                                                                 EE                  0     NonRes              0
                                                                                              CE/UE               z2    Res/UE              z2
5.  All remaining linked situations                                                           EE                  0     NonRes              0
                                                                                              CE/UE               z3    Res/UE              z3

EE is erroneous enumeration.
CE is correct enumeration.
UE is unresolved.
ResisresidentonCensusDay.NonResisnotaresidentonCensusDay.6-16SectionIIChapter6A.C.E.RevisionIIEstimationU.S.CensusBureau,Census2000 Attachment2.ControlCellsforLinkedESampleRace/HispanicOriginDomainTenureLinkedsituationControlcellDomain4(Non-HispanicBlack) | |||
Owner 3.4.5.Nonowner 3.4. | |||
5.Domain3(Hispanic) | |||
Owner 3.4.5.Nonowner 3.4. | |||
5.Domain7(Non-HispanicWhiteorSomeOtherRace)Domain5(NativeHawaiianorPacificIslander) | |||
Domain6(Non-HispanicAsian) | |||
Domain1(AmericanIndianorAlaskaNativeOnReservation) | |||
Domain2(AmericanIndianorAlaskaNativeOffReservation) | |||
Owner 3.4. | |||
5.Nonowner 3.4.5.SectionIIChapter66-17A.C.E.RevisionIIEstimationU.S.CensusBureauCensus2000 Attachment3.CorrelationBiasAdjustmentGroupingsandFactorsRace/HispanicOriginDomainAgeDAsex ratios Adjustment factor Black:Domain4(Non-HispanicBlack)18-290.901.0830-490.891.1050+0.761.05 Non-Black:Domain3(Hispanic)Domain7(Non-HispanicWhiteorSomeOtherRace)Domain5(NativeHawaiianorPacificIslander) | |||
Domain6(Non-HispanicAsian) | |||
Domain1(AmericanIndianorAlaskaNativeOnReservation) | |||
Domain2(AmericanIndianorAlaskaNativeOffReservation)18-291.041.00*30-491.011.0250+0.861.01*Thisnumbersetto1.00duetotheinconsistencybetweenDAandA.C.E.RevisionIIresults.6-18SectionIIChapter6A.C.E.RevisionIIEstimationU.S.CensusBureauCensus2000 Chapter7.AssessingtheEstimates INTRODUCTIONTheevaluationsoftheA.C.E.RevisionIIestimatesmaybedividedintotwocategories.Onecategorycontainsthe evaluationsthatfocusonindividualerrorcomponents.TheothergroupconsistsofcomparisonsoftherelativeerrorbetweenthecensusandtheA.C.E.RevisionIIestima-tor.Thischapterprovidesabriefdescriptionoftheevaluationstudies.Thecomponenterrorsexaminedbyseparatestudiesaresamplingerror,errorfromimputationmodelselection,errorduetousinginmoverstoestimateout-moversinPES-C,syntheticerror,errorintheidentification ofthecensusduplicatesasdeterminedbyadministrativerecords,errorintheidentificationofcomputerduplicatesasdeterminedbyaclericalreview,errorfrominconsistent post-stratificationvariables,andpotentialerrorarisingfromtheautomatedcodingofsomecases,calledtheat-riskcoding,intheRevisionSample.Thecomparisonsof relativeerrorbetweenthecensusandtheA.C.E.RevisionIIestimatorincludeacomparisonwithDemographicAnalysis,theconstructionofconfidenceintervalsthat accountforbiasaswellasrandomerror,andlossfunction analyses.AlsointhiscategoryisanexaminationoftheconsistencyoftheestimatesofcoverageerrormeasuredbytheA.C.E.RevisionIIestimatorandtheHousingUnit CoverageStudy(HUCS).Althoughanadjustmentforcorre-lationbiasisincludedintheA.C.E.RevisionIIestimates,noevaluationsaddresstheerrorinthelevelofcorrelation biasorthemodelusedtodistributeitacrosspost-strata.Thereasonisthatexaminingalternativemodelsonlyaccountsfordifferencesinmodels.Thosedifferences 
wouldreflectthevariationsinhowtheseveralmodelscor-recttheoriginalDSEsforcorrelationbiases,butwouldnotreflectthepresenceorabsenceofcorrelationbiasinthecorrectedDSEs.SAMPLINGERRORSamplingerrorgivesrisetorandomerror,whichisquanti-fiedbysamplingvariance.Thesamplingvarianceispresentinanyestimatebasedonasampleinsteadofthewholepopulation.Thevarianceestimationmethodologyis asimplifiedjackknifewiththeblockclustersbeingthepri-marysamplingunit.Theeffectofwithin-clustersubsam-plingisimplicitlycapturedintheweighting.TheMarch2001A.C.E.datashowedthatthesimplifiedjackknifemethodproducessatisfactoryvarianceesti-mates.SinceacorrelationbiasadjustmentwasincludedintheA.C.E.RevisionIIestimates,theadjustmentforcorrela-tionbiaswasrecalculatedforeachreplicate.Analterna-tivevarianceestimationprocedureassumedthattheform ofthecorrelationbiasadjustmentwasascalartimesthedouble-samplingestimator.ThereplicationmethodalsoaccountsfortheA.C.E.blockclustersampling.SYNTHETICERROREVALUATIONTheA.C.E.RevisionIIhasseveralpotentialsourcesofsyn-theticerror.Onesourceinvolvescorrectingtheindividual post-stratumestimatesforerrorestimatesatmoreaggre-gatelevels,suchascorrectionsforcorrelationbiasandmeasurementcodingerrors.However,theevaluationof syntheticerrorfocusesonerrorinsmallareaestimation.Syntheticestimationbiasariseswhenareasinapost-stratumhavedifferentcoverageerrorrates,buthavethe samecensuscoveragecorrectionfactor.Toassesssyn-theticestimationbiasforagivenarea,anestimatebasedondatafromtheareaalone,calledadirectestimate,must bedeveloped.Suchanestimateispossibleforonlylarge areas.Inlieuofdirectestimates,syntheticestimationbiasinundercountestimatesisestimatedfromanalysisofartificialpopulationsorsurrogatevariableswhosegeo-graphicdistributionsareknown.Thesesurrogatevariablesareconstructedasbestaspossibletohavepatternssimi-lartocoverageerror.Sensitivityanalysesassessthe 
impact of synthetic estimation bias for these variables.
The evaluation of synthetic error within post-strata uses an artificial population analysis similar to those conducted for ESCAP I and ESCAP II. These studies are documented in Griffin and Malec (2001, 2001b). This time, however, the evaluation compares the A.C.E. Revision II estimates and Census 2000. The study uses loss functions for assessing the effect of synthetic error. The major products are:
* Estimates of the bias in the difference between census loss and A.C.E. Revision II estimator loss.
* Indicator of whether the decision to use the A.C.E. Revision II estimator would have changed due to synthetic error.
ERROR DUE TO USING INMOVERS TO ESTIMATE OUTMOVERS IN PES-C
The error due to using inmovers to estimate outmovers is unique to the PES-C model for dual system estimation used in the original A.C.E. and the A.C.E. Revision II. For the PES-C model, the members of the P sample are the residents of the housing units on Census Day. There is some difficulty in identifying all the residents of all the housing units on Census Day because some move prior to the A.C.E. interview. The A.C.E. interview relies on the respondents to identify those who have moved out, the outmovers. Since the outmovers are identified by proxies, many of the outmovers are not recorded. Therefore, the estimate of outmovers is too low. To avoid a bias caused by an underestimate of the number of movers, PES-C uses the number of inmovers to estimate the number of outmovers. The inmovers are those who did not live in the sample blocks on Census Day, but moved in prior to the A.C.E. interview. Theoretically, the number of inmovers in the whole country should equal the number of outmovers. | |||
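The PES-C treatment of movers described above can be illustrated with a minimal sketch. It assumes the commonly cited form in which the match rate observed among outmovers is applied to the count of inmovers; that form, the function name, the argument names, and the numbers are illustrative assumptions and are not taken from the A.C.E. specifications.

```python
def pes_c_match_rate(nonmover_matches, nonmover_total,
                     outmover_matches, outmover_total,
                     inmover_total):
    """Return a P-sample match rate with movers handled PES-C style.

    All arguments are weighted counts for one post-stratum (hypothetical
    inputs). Assumption: inmovers supply the number of movers, while the
    outmovers supply the match rate applied to that number.
    """
    if outmover_total <= 0 or nonmover_total + inmover_total <= 0:
        raise ValueError("totals must be positive")
    outmover_match_rate = outmover_matches / outmover_total
    # Estimated matched movers: outmover match rate times the inmover count.
    estimated_mover_matches = outmover_match_rate * inmover_total
    return ((nonmover_matches + estimated_mover_matches)
            / (nonmover_total + inmover_total))


# Made-up weighted counts: only 50 outmovers were reported by proxies,
# but 80 inmovers were found in the same blocks.
example_rate = pes_c_match_rate(900.0, 1000.0, 30.0, 50.0, 80.0)
print(round(example_rate, 4))
```

Using the inmover count in the denominator is what protects the match rate from the under-reporting of outmovers; the comparison with raked outmovers described in the text tests how sensitive the estimates are to that substitution.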
However, the number of inmovers may not equal the number of outmovers in a post-stratum because of circumstances such as economic conditions causing more people to move out of an area than to move into an area.
The first step of the methodology consists of raking the number of outmovers to total inmovers. The distribution of the raked outmovers may better describe the outmovers than the distribution of the inmovers. The A.C.E. Revision II estimates formed by using the number of inmovers are compared with the A.C.E. Revision II estimates calculated using the raked number.
ERROR FROM IMPUTATION MODEL SELECTION
This project estimates the uncertainty due to choice of imputation model by drawing on the analysis of reasonable alternatives to the imputation model conducted in 2001. See Keathley et al. (2001) for details. The ideal approach would be to repeat the very time-consuming analysis of reasonable alternatives for the A.C.E. Revision II estimator. However, this analysis was not conducted due to limited resources. Instead, an estimate of the additional variance due to the choice of imputation model is developed using the previous A.C.E. work.
Estimates are formed of the variance component for census coverage correction factors that accounts for the missing data error component due to the imputation of enumeration status, residency status, match status, and the P-sample noninterview adjustment. The replicates used to estimate the missing data variance are used in the loss function analysis to represent the random error due to the choice of the imputation models for missing data.
EXAMINING THE QUALITY OF THE COMPUTER DUPLICATES WITH ADMINISTRATIVE RECORDS
Administrative records provide an opportunity to examine the quality of the estimates of duplicate enumerations used in the A.C.E. Revision II estimates. This study uses the Statistical Administrative Records System (StARS) 2000 (Leggieri et al., 2002; Judson, 2000) to assess the effectiveness of the automated methodology used in the Further Study of Person Duplication (FSPD) to identify duplicate enumerations. Secondary goals are to provide data that can be analyzed to determine the nature of the census duplication, so that the information may be used in reducing census duplication in 2010 and to aid in the
evaluation of the methodology for the construction of StARS 2000. The study produces a comparison of the estimated amount of census duplication based on administrative records with the estimate from FSPD.
StARS is a new methodology that compiles seven administrative records files, including files from IRS, Medicare, HUD, and Selective Service (see footnote 1). The evaluation uses a previous match between the census and StARS 2000 to assign an Identification (ID) Number to as many census records as possible. The process of assigning ID Numbers was based on name and address. One pass through the census files used both the address and the name to assign ID Numbers. A second pass used only the name and birthdate. A census record was assigned an ID Number only if it was linked with exactly one ID Number.
Census enumerations with the same ID Number are considered duplicates. The method accounts for coincidental agreement of names by requiring assignment of ID Numbers only when exactly one ID Number was linked to the enumeration. In most cases, two people with very similar names and characteristics would have linked to each other's ID Number and would not have been assigned a unique ID Number.
CLERICAL REVIEW OF COMPUTER DUPLICATES
The study examines the accuracy of the FSPD computer identification of duplication in the census by having clerks review the enumerations that the computer designates as duplicates. The clerks determine whether the sets of two enumerations appear to be the same persons. In addition, census enumerations identified as duplicates by administrative records, but not by the computer, also have a clerical review. The potential census duplicates identified by administrative records are a by-product of the evaluation of the computer duplicates using administrative records.
The review is restricted to duplicates between enumerations in the E sample in the A.C.E. blocks and census enumerations outside the search area. Links between P-sample nonmatches and enumerations outside the search area also are reviewed. The clerical review produces the following:
* Number of E-sample enumerations with false duplicate links identified by the computer.
* Number of E-sample enumerations with missed duplicates identified by administrative records that are correct.
* Number of P-sample nonmatches with false duplicate links identified by the computer.
* Number of P-sample nonmatches with missed duplicates identified by administrative records that are correct.
With these results, the accuracy rate for the computer identification of duplicates in the census and between the P-sample nonmatches and the census can be computed.
Footnote 1: The Census Bureau obtains administrative data for its StARS database as authorized by Title 13 U.S.C., section 6 and supported by provisions of the Privacy Act of 1974. Under Title 13, the Census Bureau is required to protect the confidentiality of all the information it receives directly from respondents or indirectly from administrative agencies and is permitted only to use that information for statistical purposes.
AT-RISK CODING
The study assesses the amount of error at risk due to not having each and every case in the Evaluation Follow-up (EFU) sample reviewed clerically (Adams and Krejsa, 2002). | |||
The data collected in the Evaluation Follow-up of the A.C.E. found errors in the coding of E-sample census enumeration status and P-sample residence and match status that needed to be corrected for the A.C.E. Revision II estimator. Ideally, this would mean recoding the entire A.C.E. sample, but that was not possible because the Evaluation Follow-up collected data in only 2,259 out of the 11,303 A.C.E. sample clusters. Even clerically recoding the 70,000 cases in the Evaluation Follow-up sample was not feasible because of time constraints. A new strategy was devised to provide the highest-quality data possible in the time allowed by restricting the clerical review to the more difficult cases. This strategy reduced the clerical workload to about 25,000 cases, which could be done, and ensured the largest sample possible for the A.C.E. Revision II estimates.
Since the Person Follow-up (PFU) and the Evaluation Follow-up (EFU) questionnaires had been keyed and were available in electronic form, data were combined using an algorithm based on the keyed data and a clerical coding of the categories of cases where the computer did not appear to do a good job.
The method compares the code assigned based on the PFU questionnaire to the code assigned based on the EFU questionnaire, and then determines the best code. The effectiveness of the computer algorithm is assessed by the agreement between the two new codes, and a comparison with recodes assigned in the fall of 2001 to a subsample of the EFU E sample called the Person Follow-up/Evaluation Follow-up (PFU/EFU) Review. The PFU/EFU Review is believed to have been the best A.C.E. coding operation.
For the P sample in the Evaluation Follow-up, a coding algorithm for the keyed data from the PFU and EFU questionnaires also was developed. Assessing the quality was not as easy for the nonmatches and unresolved cases as for the matches. Although recodes from the PFU/EFU Review were available for the matches in the P sample, none of the nonmatches or unresolved cases were included.
The categories of cases not sent for clerical review had a high agreement rate between the PFU and EFU codes assigned by the computer algorithm. For the cases in these categories where the PFU and EFU disagreed, the selected code came from the form with more detailed information. Therefore, there are three types of cases in the estimation:
1. The PFU and EFU codes assigned by computer agree.
2. The PFU and EFU codes assigned by computer disagree, but are in a category where there is high consistency between the PFU and EFU codes, and either the PFU form or the EFU form does not have answers to all the questions. The code for the form with complete data is selected.
3. Clerically assigned codes.
The first group is called the at-risk cases. These cases may have a higher risk of error than the others because of the lack of clerical review, even though the codes assigned by the computer algorithm agree. However, cases in the second group may also have error, although they are in a category with high consistency between the PFU and EFU. For these cases, there is no way to assess the risk of error due to the lack of information on one of the forms.
To assess the potential for error, the at-risk cases are assumed to have the same error rate as cases in their category in the PFU/EFU Review. The potential impact is assessed by comparing the A.C.E. Revision II double-sampling adjustment factors with the double-sampling ratios under the assumption that incorporates the error rates. The double-sampling adjustment factors are described in Chapter 6.
INCONSISTENCY OF POST-STRATIFICATION VARIABLES
Inconsistency in the E- and P-sample reporting of the characteristics used in defining the post-strata may create a bias in the dual system estimate (DSE). This bias affects the estimation of the P-sample match rate.
The analysis of the post-stratification variables for the A.C.E. Revision II estimator was similar to the investigation done for the original A.C.E. The basic approach was to estimate the inconsistency in the post-stratification variables using the matches, then assume that the rates also held for the nonmatches. The models used for the inconsistency analysis of the original A.C.E. post-strata,
described in Haberman and Spencer (2001), were fitted in two steps: (1) models for inconsistency of basic variables, and (2) derivation of inconsistency probabilities for post-stratification given the inconsistency probabilities of the basic variables. The inconsistency probabilities led to an estimate of the bias in the P-sample match rate that was used to estimate the bias in the DSE. The approach taken for the A.C.E. Revision II estimator is to re-calculate the models in (1) and (2) to reflect revisions in the P-sample post-stratification and repeat the analysis.
To assess the bias due to inconsistency in the post-stratification variables, the A.C.E. Revision II estimates are calculated with a correction to the match rate for the inconsistency. Estimates with and without the correction are then compared.
CONSISTENCY BETWEEN THE A.C.E. REVISION II ESTIMATOR AND HUCS
The study examines the validity of the A.C.E. Revision II estimates by assessing the consistency in the results from the A.C.E. Revision II estimates and the Housing Unit Coverage Study (HUCS) described in Barrett et al. (2001). Since the A.C.E. Revision II estimates could have been used in the post-censal estimates program that utilizes the average household size in many calculations, it is important to consider the consistency between the A.C.E. Revision II estimates and the HUCS data. A.C.E. Revision II estimates census coverage for people and HUCS estimates census coverage for housing units. | |||
Patterns in the differential coverage for demographic and geographic groups were examined. Similar patterns in the measures of change in census coverage between 1990 and 2000 for demographic and geographic groups are expected. If there is a substantial difference in the census coverage error caused by missing whole households and by missing people within households, the differential coverage of people and of housing units may not show similar patterns.
If there are demographic or geographic groups where the differential coverage from the A.C.E. Revision II estimator and HUCS is substantially different, the study attempts to describe whether the disagreement is a symptom of problems with the A.C.E. Revision II estimator or HUCS, or the result of legitimate differences in coverage.
RELATIVE ACCURACY OF THE CENSUS AND A.C.E. REVISION II ESTIMATOR USING DEMOGRAPHIC ANALYSIS
Demographic Analysis (DA) uses vital records, immigration statistics, and Medicare data to obtain an estimate of the population size. Since the methods are somewhat independent of the census, DA provides a method for assessing the relative quality of the census and the A.C.E. Revision II.
The consistency of estimates of differential census coverage from the A.C.E. Revision II estimator and DA is assessed for demographic groups. Estimates of differential census coverage are compared by demographic characteristics, including race, sex, and age. | |||
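A comparison of differential coverage from two sources, as described above, can be sketched as follows. The sketch assumes the usual definition of the net undercount rate as the difference between an estimate and the census count, expressed as a percentage of the estimate. The group labels and all counts are made up for illustration, and the function names are hypothetical.

```python
def net_undercount_rate(estimate, census_count):
    """Percent net undercount relative to the estimate (negative = net overcount)."""
    if estimate <= 0:
        raise ValueError("estimate must be positive")
    return 100.0 * (estimate - census_count) / estimate


def differential_rates(estimates, census_counts, reference_group):
    """Net undercount rate of each group minus the rate of the reference group."""
    rates = {group: net_undercount_rate(estimates[group], census_counts[group])
             for group in estimates}
    base = rates[reference_group]
    return {group: rates[group] - base for group in rates}


# Made-up counts for two hypothetical demographic groups.
census = {"group_a": 1000.0, "group_b": 2000.0}
survey_based = {"group_a": 1020.0, "group_b": 2010.0}   # stand-in for a survey estimate
demographic = {"group_a": 1030.0, "group_b": 2020.0}    # stand-in for a DA estimate

survey_diff = differential_rates(survey_based, census, "group_b")
da_diff = differential_rates(demographic, census, "group_b")

# The two sources are "consistent" in this toy sense if they agree on which
# group is covered less well, even when their population totals differ.
same_direction = (survey_diff["group_a"] > 0) == (da_diff["group_a"] > 0)
print(survey_diff, da_diff, same_direction)
```

Working with differences between groups rather than levels mirrors the point made in the text that DA is trusted more for differential coverage than for population size.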
The estimates of population size based on DA are not viewed with as much confidence as the estimates of differential coverage. DA does a better job of measuring differences in coverage between groups than population size.
In addition, sex ratios from the A.C.E. Revision II estimates and DA are compared. The sex ratio is the ratio of males to females and provides a measure of differential coverage of males and females, especially when calculated for race groups.
These comparisons are repeated with 1990 Post-Enumeration Survey and DA estimates to provide a context for viewing the comparisons with the 2000 data. An assessment is conducted to determine whether both methods measure the same change in differential net undercounts from 1990 to 2000.
RELATIVE ACCURACY OF THE CENSUS AND THE A.C.E. REVISION II ESTIMATOR USING CONFIDENCE INTERVALS AND LOSS FUNCTION ANALYSIS
Two additional methods of assessing the relative accuracy of the census and the A.C.E. Revision II estimates are using confidence intervals for the net undercount rate and a loss function analysis. Confidence intervals for net undercount rates are formed using estimates of net bias and variance. Since most of the data available on the quality of the original A.C.E. is being incorporated in the A.C.E. Revision II estimates, the estimation of the net bias uses the data that were not included. In the loss function analysis, the mean squared error weighted by the reciprocal of the census count is used to estimate loss for levels and shares for counties and places across the nation and within state.
Confidence intervals that incorporate the net bias as well as the variance for the net undercount rate provide a method for comparing the relative accuracy of the census and the A.C.E. Revision II estimates. The net bias in the census coverage correction factor is estimated for each post-stratum. With the estimated bias and variance for each census coverage correction factor, the bias $\hat{B}(\hat{U})$ and variance $\hat{V}$ in the net undercount rate $\hat{U}$ are estimated. Also, 95 percent confidence intervals for the net undercount rate are constructed by
$\left(\hat{U} - \hat{B}(\hat{U}) - 2\sqrt{\hat{V}},\ \hat{U} - \hat{B}(\hat{U}) + 2\sqrt{\hat{V}}\right)$.
Since $U = 0$ corresponds to no adjustment of the census, one comparison of the relative accuracy of the census and the A.C.E. Revision II estimates is based on an assessment of whether the confidence intervals for the evaluation
post-strata cover 0 and the estimated net undercount rate $\hat{U}$.
A loss function analysis for levels and shares compares the census and the A.C.E. Revision II estimator for counties and places across the nation and within state. The measure of accuracy used by the loss functions is the weighted mean squared error with the weights set to the reciprocal of the census count for levels and the reciprocal of census share for shares. The motivation for the selected groupings for the loss functions is their potential use in the post-censal estimates. These groupings are:
* Levels
** All counties with population of 100,000 or less
** All counties with population greater than 100,000
** All places with population at least 25,000 but less than 50,000
** All places with population at least 50,000 but less than 100,000
** All places with population greater than 100,000
* Shares within state
** All counties
** All places
* Shares within U.S.
** All places with population at least 25,000 but less than 50,000
** All places with population at least 50,000 but less than 100,000
** All places with population greater than 100,000
** All states
Section II. | |||
References
Adams, T. and Krejsa, E. (2001). ESCAP II: Results of the Person Followup and Evaluation Followup Forms Review, Executive Steering Committee for A.C.E. Policy II, Report 24.
Adams, T. and Krejsa, E. (2002). A.C.E. Revision II Measurement Subgroup Documentation, DSSD A.C.E. Revision II Memorandum Series #PP-6.
Adams, T. and Liu, X. (2001). ESCAP II: Evaluation of Lack of Balance and Geographic Errors Affecting Person Estimates, Executive Steering Committee for A.C.E. Policy II, Report 2.
Barrett, D., Beaghen, M., Smith, D., and Burcham, J. (2001). ESCAP II: Census 2000 Housing Unit Coverage Study, Executive Steering Committee for A.C.E. Policy II, Report 17.
Bean, S. (2001). ESCAP II: Accuracy and Coverage Evaluation Matching Error, Executive Steering Committee for A.C.E. Policy II, Report 7.
Cantwell, P. and Childers, D. (2001). Accuracy and Coverage Evaluation Survey: A Change to the Imputation Cells to Address Unresolved Resident and Enumeration Status, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-44.
Childers, D. (2001). Accuracy and Coverage Evaluation: The Design Document, DSSD Census 2000 Procedures and Operations Memorandum Series, Chapter S-DT-1, Revised.
Davis, P. (2001). Accuracy and Coverage Evaluation: Dual System Estimation Results, DSSD Census 2000 Procedures and Operations Memorandum Series #B-9*.
ESCAP I (2001). Report of the Executive Steering Committee for Accuracy and Coverage Evaluation Policy, March 1, 2001. (See www.census.gov/dmd/www/pdf/Escap2.pdf)
ESCAP II (2001). Report of the Executive Steering Committee for Accuracy and Coverage Evaluation Policy on Adjustment for Non-Redistricting Uses, October 17, 2001. (See www.census.gov/dmd/www/pdf/Recommend2.pdf)
Fay, R. (2001). ESCAP II: Evidence of Additional Erroneous Enumerations from the Person Duplication Study, Executive Steering Committee for A.C.E. Policy II, Report 9, Preliminary Version, October 26, 2001.
Fay, R. (2002). ESCAP II: Evidence of Additional Erroneous Enumerations from the Person Duplication Study, Executive Steering Committee for A.C.E. Policy II, Report 9, Revised Version, March 27, 2002.
Fay, R. (2002b). Probabilistic Models for Detecting Census Person Duplication, Proceedings of the Survey Research Methods Section, American Statistical Association.
Feldpausch, R. (2001). ESCAP II: Census Person Duplication and the Corresponding A.C.E. Enumeration Status, Executive Steering Committee for A.C.E. Policy II, Report 6.
Griffin, R. and Malec, D. (2001). Accuracy and Coverage Evaluation: Assessment of Synthetic Assumption, DSSD Census 2000 Procedures and Operations Memorandum Series #B-14*.
Griffin, R. and Malec, D. (2001b). ESCAP II: Sensitivity Analysis for the Assessment of the Synthetic Assumption, Executive Steering Committee for A.C.E. Policy II, Report 23.
Haberman, S. and Spencer, B. (2001). Estimation of Inconsistent Post-stratification in the 2000 A.C.E. Paper prepared by Abt Associates Inc. and Spencer Statistics, Inc. under Task Number 46-YABC-7-00001, Contract Number 50-YABC-7-66020.
Haines, D. (2001). Accuracy and Coverage Evaluation Survey: Computer Specifications for Person Dual System Estimation (U.S.) - Re-issue of Q-37, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-48.
Hogan, H. (1993). The 1990 Post-Enumeration Survey: Operations and Results, Journal of the American Statistical Association, 88, 1047-1060.
Hogan, H. (2002). Five Challenges in Preparing Improved Post Censal Population Estimates, DSSD A.C.E. Revision II Memorandum Series #PP-1.
Hogan, H., Kostanich, D., Whitford, D., and Singh, R. (2002). Research Findings of the Accuracy and Coverage Evaluation and Census 2000 Accuracy, Proceedings of the Survey Research Methods Section, American Statistical Association.
Ikeda, M. (2001). Accuracy and Coverage Evaluation Survey: Some Notes Related to Accuracy and Coverage Evaluation Missing Data Procedures, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-77.
Ikeda, M. and McGrath, D. (2001). Accuracy and Coverage Evaluation Survey: Specifications for the Missing Data Procedures; Revision of Q-25, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-62.
Judson, D. (2000). The Statistical Administrative Records System: System Design and Challenges, Paper presented at the NISS/Telcordia Data Quality Conference, November, 2000.
Keathley, D., Kearney, A., and Bell, W. (2001). ESCAP II: Analysis of Missing Data Alternatives for the Accuracy and Coverage Evaluation, Executive Steering Committee for A.C.E. Policy II, Report 12.
Kostanich, D. (2003). A.C.E. Revision II: Summary of Methodology, DSSD A.C.E. Revision II Memorandum Series #PP-35.
Krejsa, E. and Adams, T. (2002). Results of the A.C.E. Revision II Measurement Coding, DSSD A.C.E. Revision II Memorandum Series #PP-55.
Krejsa, E. and Raglin, D. (2001). ESCAP II: Evaluation Results for Changes in A.C.E. Enumeration Status, Executive Steering Committee for A.C.E. Policy II, Report 3.
Leggieri, C., Pistiner, A., and Farber, J. (2002). Methods for Conducting an Administrative Records Experiment in Census 2000, Proceedings of the Survey Research Methods Section, American Statistical Association.
Mule, T. (2001). ESCAP II: Person Duplication in Census 2000, Executive Steering Committee for A.C.E. Policy II, Report 20.
Mule, T. (2001b). Accuracy and Coverage Evaluation: Decomposition of Dual System Estimate Components, DSSD Census 2000 Procedures and Operations Memorandum Series #B-8*.
Mule, T. (2002). Revised Preliminary Estimates of Net Undercounts for Seven Race/Ethnicity Groupings, DSSD A.C.E. Revision II Memorandum Series #PP-2.
Mule, T. (2002b). Further Study of Person Duplication Statistical Matching and Modeling Methodology, DSSD A.C.E. Revision II Memorandum Series #PP-51.
Mulry, M. and Petroni, R. (2002). Error Profile for PES-C as Implemented in the 2000 A.C.E., Proceedings of the Survey Research Methods Section, American Statistical Association.
Nash, F. (2000). Overview of the Duplicate Housing Unit Operations, Census 2000 Information Memorandum Number 78.
Raglin, D. and Krejsa, E. (2001). ESCAP II: Evaluation Results for Changes in Mover and Residence Status in the A.C.E., Executive Steering Committee for A.C.E. Policy II, Report 16.
Robinson, J. G. (2001). ESCAP II: Demographic Analysis Results, Executive Steering Committee for A.C.E. Policy II, Report 1.
Thompson, J., Waite, P., and Fay, R. (2001). Basis of Revised Early Approximation of Undercounts Released October 17, 2001, Executive Steering Committee for A.C.E. Policy II, Report 9a.
U.S. Census Bureau (2003). Technical Assessment of A.C.E. Revision II, March 12, 2003. (See www.census.gov/dmd/www/pdf/ACETechAssess.pdf)
Winkler, W. (1995). Matching and Record Linkage, Business Survey Methods, ed. B. G. Cox et al. (New York: J. Wiley and Sons), 355-384.
Winkler, W. (1999). Documentation for Record Linkage Software, U.S. Census Bureau, Statistical Research Division.
Yancey, W. (2002). BigMatch: A Program for Extracting Probable Matches from a Large File for Record Linkage, U.S. Census Bureau, Statistical Research Division.}} |
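Two of the accuracy measures described in Chapter 7, the confidence interval that subtracts the estimated bias and the loss weighted by the reciprocal of the census count, can be illustrated with a minimal sketch. The function names, the numbers, and in particular the `targets` list are hypothetical: in the evaluation itself the true values are unknown and the mean squared error is estimated from bias and variance estimates, so this only shows the arithmetic.

```python
import math


def bias_adjusted_interval(u_hat, bias_hat, var_hat):
    """Interval (u - b - 2*sqrt(v), u - b + 2*sqrt(v)) for a net undercount rate."""
    if var_hat < 0:
        raise ValueError("variance must be non-negative")
    center = u_hat - bias_hat
    half_width = 2.0 * math.sqrt(var_hat)
    return (center - half_width, center + half_width)


def covers(interval, value):
    """True if value lies inside the closed interval."""
    lower, upper = interval
    return lower <= value <= upper


def weighted_squared_error_loss(estimates, targets, census_counts):
    """Sum of squared errors, each weighted by the reciprocal of the census count."""
    return sum((est - tgt) ** 2 / count
               for est, tgt, count in zip(estimates, targets, census_counts))


# Made-up post-stratum values: estimated rate 1.0, estimated bias 0.2, variance 0.04.
interval = bias_adjusted_interval(u_hat=1.0, bias_hat=0.2, var_hat=0.04)
covers_zero = covers(interval, 0.0)       # 0 corresponds to no adjustment
covers_estimate = covers(interval, 1.0)   # the estimated rate itself

# Made-up area counts; targets stand in for the unknown true populations.
census_counts = [100.0, 200.0]
adjusted = [105.0, 198.0]
targets = [104.0, 200.0]
census_loss = weighted_squared_error_loss(census_counts, targets, census_counts)
adjusted_loss = weighted_squared_error_loss(adjusted, targets, census_counts)
print(interval, covers_zero, covers_estimate, census_loss, adjusted_loss)
```

In this toy case the interval excludes 0 but includes the estimated rate, and the adjusted counts have the smaller loss; real comparisons depend entirely on the estimated bias and variance.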
Revision as of 21:26, 29 July 2018
ML12088A236 | |
Person / Time | |
---|---|
Site: | Indian Point |
Issue date: | 03/28/2012 |
From: | Entergy Nuclear Operations, US Dept of Commerce, Bureau of Census |
To: | Atomic Safety and Licensing Board Panel |
SECY RAS | |
Shared Package | |
ML12088A225 | List: |
References | |
RAS 22094, 50-247-LR, 50-286-LR, ASLBP 07-858-03-LR-BD01 | |
Download: ML12088A236 (202) | |
Text
ENT000017 Submitted: March 28, 2012
ACKNOWLEDGMENTS
This technical document was prepared under the direction of Donna Kostanich, Assistant Division Chief for Sampling and Estimation, Decennial Statistical Studies Division. The overall management and coordination of the review was conducted by Dawn Haines and Douglas Olson.
The combined efforts of numerous U.S. Census Bureau staff have culminated in the publication of this document. Some staff members wrote chapters, while others reviewed chapters. In some cases, staff members filled both capacities.
Contributing to the March 2001 portion of Accuracy and Coverage Evaluation of Census 2000: Design and Methodology were Patrick Cantwell, Inez Chen, Danny Childers, Peter Davis, James Farber, Deborah Fenstermaker, Richard Griffin, Dawn Haines, Howard Hogan, Michael Ikeda, Donna Kostanich, Vincent Thomas Mule, Mary Mulry, Alfredo Navarro, Douglas Olson, J. Gregory Robinson, Robert Sands, and Michael Starsinic. Joseph Waksberg, of Westat, Inc., reviewed these chapters for readability and consistency.
Contributing to the A.C.E. Revision II section of Accuracy and Coverage Evaluation of Census 2000: Design and Methodology were Tamara Adams, Michael Beaghen, William Bell, Patrick Cantwell, Deborah Fenstermaker, Richard Griffin, Dawn Haines, Michael Ikeda, Donna Kostanich, Elizabeth Krejsa, Vincent Thomas Mule, Mary Mulry, Rita Petroni, Robert Sands, Eric Schindler, Bruce Spencer, of Northwestern University, and David Whitford.
Rhonda Geddings provided administrative support.
Bernadette Beasley, Meshel Butler, Helen Curtis, Susan Kelly, and Kim Ottenstein of the Administrative and Customer Services Division, Walter Odom, Chief, provided publications and printing management, graphics design, and composition and editorial review for print and electronic media. General direction and production management were provided by James Clark, Assistant Division Chief.
Margaret Smith of ACSD provided assistance in placing the electronic version of this document on the Internet (see www.census.gov/dmd/www/refroom.html).
We are grateful for the assistance of the individuals listed and all others who contributed but are not specifically mentioned. The preparation and publication of this document was possible because of their invaluable contributions.
Vacant, Principal Associate Director and Chief Financial Officer
Vacant, Principal Associate Director for Programs
Preston Jay Waite, Associate Director for Decennial Census
Nancy M. Gordon, Associate Director for Demographic Programs
SUGGESTED CITATION
FILES: Census 2000, Accuracy and Coverage Evaluation of Census 2000: Design and Methodology U.S. Census Bureau, 2004
Cynthia Z.F. Clark, Associate Director for Methodology and Standards
Marvin D. Raines, Associate Director for Field Operations
Arnold A. Jackson, Assistant Director for Decennial Census
ECONOMICS AND STATISTICS ADMINISTRATION
Kathleen B. Cooper, Under Secretary for Economic Affairs
U.S. CENSUS BUREAU
Charles Louis Kincannon, Director
Hermann Habermann, Deputy Director and Chief Operating Officer
Foreword
The U.S. Census Bureau conducted the Accuracy and Coverage Evaluation (A.C.E.) survey to measure the coverage of the population in Census 2000. The A.C.E. was designed to serve two purposes: (1) to measure the net coverage of the population, both in total and for major subgroups, and (2) to provide data that could serve as the basis for correcting the census counts for such uses as Congressional redistricting, state and local redistricting, funds allocation and governmental program administration. The A.C.E. survey provides critical information that can be used to improve the census-taking process.
However, the design, methodology, operations and data collection efforts are extremely complex and not widely understood. The work described in this publication was a major undertaking, and the technical documentation is intended to increase awareness and knowledge, and subsequently improve the 2010 Census and coverage measurement techniques.
Despite the fact that coverage measurement techniques have been utilized by the Census Bureau for several decades, this is the first comprehensive documentation of its kind. This technical document describes the methodologies that were used to produce estimates of Census 2000 coverage error from the A.C.E. The first part of this document discusses the entire survey design used to produce the original estimates of net undercount released in March 2001. Analysis and evaluations indicated that there were serious errors in the March 2001 A.C.E. Research efforts to fix the detected errors resulted in improved coverage estimates referred to as A.C.E. Revision II. The second part of this document describes the methodology used to correct for errors in the March 2001 A.C.E.
After extensive analysis and consideration, the Census Bureau ultimately decided not to use the A.C.E., neither the March 2001 nor the Revision II results, to correct the Census 2000 counts or any other data products. A.C.E.
Revision II, the superior of the two results, provides useful coverage measurement information that can be used for research purposes. All of these results, decisions, supporting analyses, technical assessments, and limitations can be found on the Census Bureau's Web site at www.census.gov/dmd/www/EscapRep.html.
This document is intended to promote knowledge and encourage collaboration on coverage measurement issues. As such, we welcome comments and suggestions from colleagues on technical issues and also on the value of this document.
Charles Louis Kincannon
Director, U.S. Census Bureau
CONTENTS
Section I: A.C.E. March 2001
Chapters
1. Introduction to the A.C.E.
2. Accuracy and Coverage Evaluation Overview
3. Design of the A.C.E. Sample
4. A.C.E. Field and Processing Activities
5. Targeted Extended Search
6. Missing Data Procedures
7. Dual System Estimation
8. Model-Based Estimation for Small Areas
Appendixes
A. Census 2000 Missing Data
B. Demographic Analysis
C. Weight Trimming
D. Error Profile for A.C.E. Estimates
Section I References
Section II: A.C.E. Revision II March 2003
Chapters
1. Introduction to A.C.E. Revision II
2. Summary of A.C.E. Revision II Methodology
3. Correcting Data for Measurement Error
4. A.C.E. Revision II Missing Data Methods
5. Further Study of Person Duplication in Census 2000
6. A.C.E. Revision II Estimation
7. Assessing the Estimates
Section II References
Accuracy and Coverage Evaluation of Census 2000: Design and Methodology
Section I
A.C.E. March 2001
U.S. Census Bureau, Census 2000
Chapter 1. Introduction to the A.C.E.
INTRODUCTION
The U.S. Census Bureau conducted the Accuracy and Coverage Evaluation (A.C.E.) to measure the coverage of the population in Census 2000 and to allow for the possibility of correcting the census results for the measured undercount. It also provides a wealth of information on the census process and may, thus, enable improvement in future censuses. This document is written to provide a clear and permanent record of the methods and operations used in this project.
The current chapter presents the objectives and scope of the A.C.E., and discusses limitations of what it was attempting to accomplish. It includes a brief history of the evolution of the statistical and operational methods upon which the A.C.E. is based. Chapter 2 presents an overview of the various statistical steps necessary to produce estimates of census coverage and how they are tied into the operation of the survey. The sequence of major activities and their timing is given. Subsequent chapters discuss in detail A.C.E. sampling, interviewing, processing, and estimation steps.
Goals
The evaluation of the completeness of census enumeration has been an integral part of the decennial census since the 1950 census. This evaluation has taken on many forms including demographic analysis, administrative record checks, matches to independent surveys, and dependent record rechecks and reinterviews.
The evaluation of the five censuses from 1950 to 1990 clearly showed that each of the traditional decennial censuses undercounted the total population, and further, missed certain identifiable population groups at greater rates than others. Specifically, these evaluations clearly showed that undercounts were not merely random occurrences, but predictable biases in the census taking process. The undercount has been consistently higher for the African-American population than for the rest of the population, and while the data set is not so extensive, the evidence also pointed to consistently higher undercounts for Hispanics, Asians, Pacific Islanders, and American Indians than for the White non-Hispanic population. The undercount was also related to socioeconomic status, chiefly measured by homeownership, with renters having consistently higher undercounts. The U.S. Census Bureau designed the Accuracy and Coverage Evaluation to measure this differential undercount and, if possible, correct the counts, thereby making the census more accurate.
As mentioned earlier, the A.C.E. was designed to serve two purposes. One goal was to measure coverage of the population, both total and in various major subdivisions such as race/ethnicity, sex, major geographical areas, and socioeconomical groupings. These measurements indicate whether changes made in enumeration methods in the 2000 census were successful in improving the census and show where improvements may be necessary in future censuses. Another goal was to provide data that could serve as the basis for correcting the census counts.
In planning the A.C.E., the Census Bureau focused on the accuracy of population totals for both geographic areas and demographic groups. Consideration was given to the possibility of both improving the population totals
(numeric accuracy) and population shares (distributive accuracy). Although early planning considered using dual system estimation to produce a one number census, after the Supreme Court ruled on the use of sampling for congressional apportionment in 1999, the survey was redesigned and refocused on non-apportionment uses. One important use was congressional redistricting. Thus an important consideration in the design was to improve the accuracy of congressional districts, which average around 650,000 people. The U.S. Census Bureau also recognized other uses, including state and local redistricting, funds allocation, and program administration. The traditional goals of coverage evaluation to inform users and aid in the planning of the next census continue to be important. These goals greatly influenced the sample and estimation design.
The A.C.E. Defined
The A.C.E. is a post-enumeration survey, based on the theory of dual system estimation. The results of the dual system estimation can be used with model-based estimation to produce census files adjusted for the measured net undercount (or net overcount). The design involved comparing (matching) the information from an independent sample survey to initial census records.
In this process, the Census Bureau conducted field interviewing and computerized and clerical matching of records. Using the results of this matching, the Census Bureau applied dual system estimation to develop estimates of coverage for various population groups. The initial plans were to apply correction factors to the census files that could be used to produce all required Census 2000 tabulations, other than apportionment. The correction aspect of Census 2000 tabulations was later abandoned.
The A.C.E. can be summarized as follows:
* Select a stratified random sample of blocks for the A.C.E.
* Create an independent list of housing units in the sample of A.C.E. blocks.
* Begin conducting telephone interviews of housing units that mailed in a completed questionnaire and that could be clearly linked to a telephone number.
* After the initial census nonresponse follow-up, conduct a personal visit interview at every housing unit on the independent list not already interviewed by telephone.
* Match the results of the A.C.E. interview to the census and vice versa.
* Search the census records for duplicates.
* Resolve cases that require additional information for matching by conducting a personal visit follow-up interview.
* Use the information from other, similar people to impute missing information.
* Categorize the A.C.E. data by age, sex, tenure, race/ethnicity and other appropriate predefined variables into estimation groupings called post-strata.
* Calculate the coverage correction factors for each post-stratum using the dual system estimator.
* If appropriate, apply the coverage correction factors to correct the initial census data using a model-based estimator and tabulate the statistically corrected census results.
There are a number of assumptions inherent in the A.C.E. Proper application of the dual system estimation (DSE) model requires the A.C.E. be conducted independently of the census and that the rules used to determine correct enumerations are the same as the rules used to determine cases eligible for matches. The DSE model can be sensitive to measurement errors. It is important to obtain consistent reporting of Census Day residence. Inclusion of fictitious persons and errors in matching can directly influence the DSE. There are other assumptions necessary in developing models for handling nonresponse and other missing information. The A.C.E. design was based very much on the theoretical concepts discussed and publicly presented by the Census Bureau in advance of the census. These concepts included careful attention to statistical independence, a strict application of the concepts of sufficient information, and careful attention to balancing the concepts used to measure census misses, as well as census erroneous inclusions. For a more detailed discussion of this approach see Hogan (2000).
Design Limitations of the A.C.E.
The A.C.E. was designed to measure the household population for large social, economic, ethnic, racial and geographic groups and compare them with the census counts.
The results provide a measure of net undercount and a mechanism to correct that net undercount, if that appears advisable. Although the goal of the A.C.E. was to measure the net undercount, it also provides information on the separate components of the net undercount such as omissions and various types of erroneous enumerations in the census. Measures of gross error cannot be obtained directly and exclusively from these components because of the strict definition of "correct" that is needed to implement the dual system estimator. For example, A.C.E. treats census enumerations as not correctly enumerated if they lacked sufficient information for accurate matching. This requirement allows for more precise matching, but increases both the number of nonmatching cases and the number of cases coded as erroneous. A similar strict rule on correct block location of an address also increases both the non-matches and erroneous enumerations. These rules may be inapplicable in the census outside the DSE context.

The design of the A.C.E. does not provide information on very local or unique errors in the census process. Specifically, the A.C.E. was not designed to correct for particular errors made by, say, a census taker or a local census manager, or to correct for local errors in the census address list. The Census Bureau had other programs in place to deal with these issues, such as the quality assurance process, the coverage improvement follow-up, and the local update of census addresses. The A.C.E. was designed, rather, to correct for large systematic errors in census taking, most especially the historic differential undercount.

Finally, the A.C.E. was not designed to measure the undercount for some special population groups such as the group quarters population (including college dormitories, institutions, and military barracks), the population that uses homeless shelters and/or soup kitchens, or the remote areas of Alaska. The Census Bureau instituted specialized procedures for these groups in order to achieve the best count possible. Extending the A.C.E. methods to all of these populations would have been very costly and difficult to implement properly.
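
The summary above refers to a dual system estimator and to coverage correction factors computed per post-stratum. The following is a minimal sketch of how those quantities relate, assuming the common simplified form DSE = (C - II) x (CE rate) / (match rate). The function names, parameter names, and the numbers in the example are hypothetical and chosen only for illustration; the estimator actually used in the A.C.E. is specified in the chapter on dual system estimation and includes details not shown here.

```python
def dual_system_estimate(census_count, whole_person_imputations,
                         correct_enumeration_rate, match_rate):
    """Simplified dual system estimate for one post-stratum (illustrative).

    census_count             -- census count C for the post-stratum
    whole_person_imputations -- records lacking sufficient information (II)
    correct_enumeration_rate -- E-sample share of correct enumerations
    match_rate               -- P-sample share of people matched to the census
    """
    if not 0.0 < match_rate <= 1.0:
        raise ValueError("match_rate must be in (0, 1]")
    if not 0.0 <= correct_enumeration_rate <= 1.0:
        raise ValueError("correct_enumeration_rate must be in [0, 1]")
    # Only data-defined records enter the estimator in this simplified form.
    data_defined = census_count - whole_person_imputations
    return data_defined * correct_enumeration_rate / match_rate


def coverage_correction_factor(dse, census_count):
    """Ratio of the estimated true population to the census count."""
    return dse / census_count


def net_undercount_rate(dse, census_count):
    """Net undercount as a share of the estimate (negative = overcount)."""
    return (dse - census_count) / dse


if __name__ == "__main__":
    # Hypothetical post-stratum, not taken from the A.C.E. results.
    dse = dual_system_estimate(1000, 20, 0.95, 0.90)
    print(round(dse, 1))
    print(round(coverage_correction_factor(dse, 1000), 4))
    print(round(net_undercount_rate(dse, 1000), 4))
```

A factor above 1 indicates a measured net undercount for the post-stratum and a factor below 1 a net overcount, which is why the same machinery can describe either outcome.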
HISTORY

Starting with 1950, every census has included a formal study of the coverage of the population. The 2000 Accuracy and Coverage Evaluation (A.C.E.) is very much a continuation of that tradition.

1950 through 1970

The U.S. Census Bureau conducted its first post-enumeration survey, or PES, as part of the 1950 census. The essential elements in a post-enumeration survey are a second attempt to enumerate a sample of households and, using case-by-case matching, to determine the number and characteristics of people not included in the first census enumeration. This first PES was not based on dual system estimation.

During the next two decades the Census Bureau experimented with alternative coverage measurement methods based on case-by-case matching including a Reverse Record Check, administrative record checks, and a match to the Current Population Survey. In addition, there were various alternative versions of PES designs.

Soon after the completion of the 1950 census, methods of aggregate demographic analysis for coverage analysis were developed at Princeton University by Ansley Coale and colleagues. See Coale (1955), Coale and Rives (1973), and Coale and Zelnick (1963) for details. Demographic analysis (DA) is the construction of an estimate of the true population using birth, death, migration and other data sources. This methodology can provide independent measures of the census net undercount by age, sex, and Black/non-Black; however, it is subject to its own limitations and uncertainties. An important limitation is the lack of data to independently estimate the Hispanic, Asian, and American Indian populations or other detailed demographic groups, such as homeowners or renters. Nor can demographic analysis provide estimates for geographic areas below the national level. In addition, the level of emigration and undocumented immigration must be estimated using indirect methods. Since the U.S. only had reasonably complete birth registration since 1935, sophisticated analysis was needed in 1950 for the population over age 15. Early studies were restricted to the native-born White population, but with time were expanded to include the native-born African-American population as well.

Later work at the U.S. Census Bureau by Jacob Siegel and colleagues expanded the estimates to the total population, with the first official estimates being issued in conjunction with the 1970 census (Siegel, 1974). The 1970 estimates recognized the need to address the problem of race misclassification in the complete count. By the time of the 1970 census, the population covered by birth registration included those under age 35, with tests of birth registration completeness having been conducted in 1940, 1950, and the mid-1960s. Medicare data now provided a basis for estimates for those over age 65.

However, the difficulty of measuring migration, an important component of DA, gained attention. These studies noted, "The figures on net immigration for the 1960 to 1970 decade should be considered as estimates subject to considerable error." Importantly, the estimates did not include any allowance for "...unrecorded alien immigration, particularly illegal immigration." See Siegel (1974) for more details.

During these same decades, the methods of dual system estimation were being refined for use in the human population. Although introduced over a century ago for use in animal populations, dual system estimation was first used with human populations in an important article by Sekar and Deming (1949) that applied the technique to measuring births. Dual system estimation was widely used to measure births and deaths in developing countries during the 1970s in conjunction with important operational and theoretical work. The ideas from dual system estimation were soon applied to post-enumeration surveys. See most importantly Marks (1979).
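
The demographic analysis approach described above builds an independent population estimate from components of change and compares it with the census count. The sketch below shows only that basic accounting, assuming a single cohort and complete component data. The function names and all numbers are hypothetical; actual demographic analysis works cohort by cohort and, as noted above, relies on other sources such as Medicare data and on indirect estimates of migration.

```python
def demographic_analysis_estimate(births, deaths, immigrants, emigrants):
    """Expected population for a cohort from components of change."""
    return births - deaths + immigrants - emigrants


def net_undercount_percent(da_estimate, census_count):
    """Census net undercount as a percent of the DA estimate.

    A negative value indicates a net overcount relative to the estimate.
    """
    return 100.0 * (da_estimate - census_count) / da_estimate


if __name__ == "__main__":
    # Hypothetical cohort, for illustration only.
    expected = demographic_analysis_estimate(1000, 200, 50, 30)
    print(expected)
    print(round(net_undercount_percent(expected, 805), 2))
```

Because the immigration and emigration terms enter the estimate directly, any error in them passes straight into the measured undercount, which is the sensitivity the text attributes to this method.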
1980

The design of the A.C.E. traces most directly to the 1980 Post-Enumeration Program (PEP). This was the first large scale post-enumeration survey to use dual system estimation. In addition, it included several important innovations, as well as important lessons on the design of a PES.

The 1980 PEP was based on a match of people included in the April and the August Current Population Survey to the 1980 census. This match was used to determine the proportion of people counted in the census. It was a sample of people known to exist and be residents of the U.S., and was labeled the Population or P sample.

All matching was done by clerks and technicians. In order to make it possible to do the matching, each person's address needed to be assigned the correct census geographic code (geocoded). This process was slow and error prone.

In addition, a separate sample of census records was drawn. This was known as the Enumeration or E sample. The census records included in the E sample were checked in the office to see if they were duplicated, followed by a field operation to determine whether the people were real, lived at the address on Census Day, and whether the unit was assigned the correct census geographic code (correctly geocoded).

One important concept introduced in 1980 was that of sufficient information for matching. Sufficient information for matching means that a record, from either the P or E sample, contains sufficient information, including most importantly a name, to allow accurate matching and follow-up. Records that lack this information are removed from matching, processing and estimation. For the E sample, this exclusion is done in two parts: census imputed records (non-data-defined) are excluded from the sampling frame, and then sampled data-defined records are reviewed for name and other necessary information.

Another concept used earlier but made explicit in 1980 was that of search area. A person was only considered correctly enumerated if he/she was counted in a specific, defined area that included the address where he/she should have been enumerated. This search area was to be applied to both the P and the E samples.

The 1980 PEP was also, very importantly, the first PES to be, itself, carefully evaluated (Fay et al., 1988). This evaluation proved invaluable to the design of the 1990 PES. Among the important findings were:

* Sampling variances were very high.
* Geocoding a sample of housing units was costly and error prone.
* Drawing independent P and E samples made it very hard to apply the same concepts, especially that of search area.
* Levels of missing data needed to be reduced and methods to account for the missing data needed to be refined.
* Matching needed to be made more accurate and faster.
* An independent sample of people living in institutions proved nearly impossible to match and process, both because the interviews relied on the same set of administrative records and because administrators often refused to give names, even to the Census Bureau.

By 1980, the precision of demographic analysis benefited from the fact that the part of the population not covered by either adequate birth registration data or Medicare data was now reduced to only those 45 to 65 (in 1980). However, immigration, especially illegal/undocumented/unauthorized immigration, remained a problem. Early demographic estimates for 1980, which again did not contain an allowance for illegal immigration, showed a net overcount of the population. However, pathbreaking work by Jeff Passel and colleagues produced the first estimates of the number of illegal immigrants counted in the census. This work was generally validated when data from the Immigration Reform and Control Act (IRCA) produced similar numbers of immigrants applying for legalization.

Although the 1980 PEP was not explicitly designed to correct the census for measured undercount, it was the first PES to be considered in this context. Increased use of census results for congressional, state, and local redistricting, as well as for federal funds allocation, highlighted the importance of census accuracy. The voting rights cases of the 1960s (Baker v. Carr (1962), Reynolds v. Sims (1964)) had greatly increased the importance of census data in redistricting. General Revenue Sharing funds, distributed in part based on census data, became an important source of local government revenue in the mid 1970s. The legal and statistical questions were discussed in academic journals and as part of several lawsuits, including influential suits by the City of Detroit and the City of New York. The U.S. Census Bureau's position was that the 1980 PEP was not of sufficient accuracy for this purpose, and this decision was upheld.
1990

Building on the knowledge gained in 1980, the Census Bureau made major design changes for the 1990 PES. Important changes included:

* Excluding institutional population and military ships/barracks from the universe.
* The use of a block sample tied to census geographic codes, with the same sample of blocks used for both the P and the E sample.
* Repeated call-back to reduce nonresponse and missing data.
* A computer and computer-assisted clerical matching operation.
* A model to account for missing data taking into account the important covariates.

The design of the estimation cells (post-strata) was completely changed. Following the advice of John Tukey and others, the estimation cells were not restricted to a single state, but allowed to cross state lines. Thus, Hispanics living in Utah could be combined with Hispanics living in Colorado and other mountain states to form one estimation cell, rather than being combined with non-Hispanics living in Utah. A smoothing model was used to combine information within Census Region.

The 1990 PES was explicitly designed so that it could be used to adjust the census results. Specifically, model-based methods were developed to carry the estimates down to the smallest census geographic units (blocks) and to include positive or negative whole person records to account for the measured net undercount or overcount. This complete file could then be aggregated to obtain data that was consistent for all geographic levels.

Many lessons were learned in 1990, many having to do with the need for tight operational control and testing. One important statistical lesson concerned the use of the statistical smoothing methods. These methods became highly controversial and became the focus of much statistical analysis and debate. They were not well understood and the U.S. Census Bureau decided to drop the use of smoothing and instead recompute the results with fewer and thus larger estimation cells.

Demographic analysis estimates went very smoothly in 1990 with birth registration and Medicare data covering all but those age 55 to 65. The IRCA data and the work of Jeff Passel and others (see Fay et al., 1988, Chapters 2 and 3) provided an allowance for undocumented immigrants. Further, for the first time, the Census Bureau produced explicit allowances for the uncertainty in the demographic analysis estimates. This analysis showed that the "preferred" or "point" demographic analysis estimates tended to fall at the lower end of the uncertainty range. However, this method of expressing the uncertainty range came under criticism from outside the Census Bureau. Limitations of this method are documented in Robinson et al. (1993) and Himes and Clogg (1992).

The 1990 Demographic Analysis estimates were in general agreement with the results of the 1990 PES. At the national level the two estimates were very close: 1.8 percent undercount for demographic analysis (later revised to 1.7 percent) and 1.6 percent for the PES. At more detailed levels, differences emerged, especially the tendency for the PES to greatly underestimate the undercount for adult African-American males. Taking into account what was known about the biases and uncertainties of each, it seemed clear that both were measuring a real differential undercount even though PES was underestimating the amount for adult African-American males.
2000

In the early 1990s, task forces and National Academy of Science Panels suggested that the differential undercount in the census could not be reduced without elaborate enumeration and matching procedures, which are too costly to be carried out except on a sample of the population. In the 1995 and 1996 Census Tests, an alternative Census Plus methodology was compared to the DSE. The performance of the DSE was better and subsequent research efforts focused on improving the DSE. Consequently, most of the A.C.E. design can be seen as a continuation and refinement of the 1990 PES design. Among the important refinements are:

* Much larger and better designed block sample.
* Earlier interviewing, including the use of early telephone interviewing.
* Computer-assisted (laptop) telephone and personal interviewing.
* More refined estimation cells (post-strata).
* Explicit collapsing rules to account for small cell size.
* Explicit weight trimming rules in case of extraordinary (outlier) cells.

The survey universe was restricted to the housing unit/household population. All group quarters populations, not just military and institutional, were excluded. Consequently, the A.C.E. estimate of coverage error will be underestimated to the extent there were errors in the group quarters population.

Another concern is the treatment in the DSE of cases involved in the Housing Unit Duplication Operation (referred to as "late census adds") and the level of whole person imputations in the census. These records were not included in the A.C.E. matching, processing, or follow-up processes. They were also excluded from the DSE, although properly accounted for in computing the net undercount. It is possible that, had these records been included in the A.C.E. and the DSE, the estimated undercount would have differed. The number of excluded records is much larger than it was in 1990. If the ratio of matches to correct enumerations is the same for the excluded and included cases, the DSE expected value should be nearly the same. However, if the people referred to in the correct cases were either much more likely to have been included in the A.C.E. or much less likely to have been included, then excluding these cases from the A.C.E. would have changed the level of correlation bias and affected the A.C.E. For more detail, see Hogan (2001).

There was a change in the treatment of people who had moved between April 1 and the time of the PES interview. In 1980 and 1990, these movers were sampled at their current (i.e., PES Interview Day) address. In the A.C.E., they were sampled at their Census Day, April 1, address.

Although conceptually much the same, the implementation of the search area was very different. In 1990, the entire search area was always to be searched for all cases in order to find matches or duplicates, and all cases were map-spotted to determine whether they were inside the search area. In 2000, the search of the surrounding blocks was restricted by both targeting and sampling. First, the surrounding block was searched for only certain kinds of cases, specifically cases where there was a likelihood of geocoding error in the basic census process. In addition, a stratified sub-sample was taken for this search, with only some of the initial sample blocks subjected to this extended search. This process was known as Targeted Extended Search, or TES.

Because of the difficulty in explaining and defending the 1990 smoothing methods, smoothing models were not employed. Instead, the A.C.E. relied upon a larger sample size and a more refined set of estimation cells to produce estimates.

Finally, although this was not a separate step, the A.C.E. was subjected to much more exacting specification, documentation and testing than any previous coverage measurement study. Much of the operational success of the A.C.E. can be traced to the care and attention given to documentation and testing. This document is then very much part of the overall A.C.E. process. It attempts to document, concisely and clearly as well as precisely and accurately, the A.C.E.
design.

Chapter 2. Accuracy and Coverage Evaluation Overview

INTRODUCTION

The Accuracy and Coverage Evaluation Survey (A.C.E.) was designed primarily to measure the net undercoverage or overcoverage in the census enumeration. The methodology used was dual system estimation that requires two independent systems of measurement. The P sample or Population sample measured the housing unit population, as did the census, but was conducted independently of the census. This was done by selecting a sample of block clusters, geographically contiguous groups of blocks, and interviewing housing units that were obtained by independently canvassing each block cluster. The results of the P sample were matched to census enumerations to determine the omission rate in the census. Additionally, a sample of census enumerations, the E sample, was selected to measure the erroneous enumeration rate in the census. The E sample consisted of census enumerations in the same sample block clusters as the P sample. These overlapping samples reduced variance on the dual system estimator, reduced the amount of field activities and their cost, and resulted in efficient data processing.

There were considerable challenges in the implementation of the A.C.E. One of the requirements of the A.C.E. was to produce measures of net undercount or overcount shortly after the census counts were compiled. This was a daunting task because the requirement for independence meant that A.C.E. activities could not interfere with, or in any way affect, the results of the census enumerations, or vice versa. As with most surveys, the A.C.E. consisted of designing a sample, creating a frame, selecting the sample, conducting the interviews, dealing with nonresponses and missing information, as well as producing the estimates. In addition, the A.C.E. had several matching and field follow-up activities. In order to accomplish these tasks and meet the goals of the A.C.E. in a timely manner, its design was uniquely built around census operations. Additionally, to ensure quality with such a compressed time schedule, it was essential that software systems be written and thoroughly tested prior to the start of an activity.

One census operation that had major influence on the A.C.E. design and estimation plan was the Housing Unit Duplication Operation. As the census questionnaires were being processed, the Census Bureau suspected that there was a significant number of duplicate addresses in the census files. To address the suspected housing unit duplication, the Housing Unit Duplication Operation was introduced in the fall of 2000. See Nash (2000) for further details. The primary goal of this census operation was to improve the quality of the census; however, its design allowed the A.C.E. operations to proceed. Essentially, suspected duplicate housing units were temporarily removed from the census files, while further analysis was done for these cases. Approximately 5.9 million person records were in these suspected duplicate housing units, which were: 1) out-of-scope for the E-sample component of the A.C.E., 2) not available for the person matching including the identification of person duplicates in the E sample, and 3) excluded from the census component in the dual system estimates. Approximately 2.3 million person records were reinstated into the census after the E sample was selected and were reflected in the net coverage estimates.
Hogan (2001) showed that excluding these person records from the A.C.E. would not affect the dual system estimates, if the number of P-sample matches was reduced proportionately to the number of E-sample correct enumerations.

This chapter summarizes the major activities of the A.C.E. and indicates their relationship to the census. Subsequent chapters go into considerably greater detail about the methodology of the A.C.E. and are organized as follows:

* Chapter 3. Design of the A.C.E. Sample
* Chapter 4. A.C.E. Field and Processing Activities
* Chapter 5. Targeted Extended Search
* Chapter 6. Missing Data Procedures
* Chapter 7. Dual System Estimation
* Chapter 8. Model-Based Estimation for Small Areas

The intent of this chapter is to provide a broad context for the design of the A.C.E. Here we give a sequential accounting of these activities. Table 2-1 gives the order in which the A.C.E. activities occurred and maps the activities to the chapter where each is discussed in further detail. This table shows the substantial integration of the sampling and operational activities. Figure 2-1 shows the flow of the major activities.

Table 2-1. Sequence of A.C.E. Activities

Activity  Description                                    Chapter(s)
1         First-phase sampling                           3
2         Independent listing                            4
3         Second-phase sampling                          3
4         Initial housing unit matching/field follow-up  4
5         Targeted extended search                       4 & 5
6         Subsampling within large block clusters        3
7         A.C.E. person interviewing                     4
8         E-sample identification                        3 & 4
9         Person matching and field follow-up            4
10        Missing data processing                        6
11        Dual system estimation                         7
12        Model-based estimation for small areas         8

Table 2-2 further illustrates the integration of the sampling activities and operations by summarizing the sample size at each phase of sampling and the operations for which the sample is an input. The data collected from each operation is input to the next sampling operation. For example, the first phase of sampling resulted in 29,136 sample areas with almost 2 million housing units.
Independent address lists were created for these areas. The results of the independent listing were used in the second phase of sampling.

Activity 1: First-Phase Sampling

Timing: March through June, 1999; prior to the creation of the census address list.

At the time of the January, 1999 Supreme Court ruling against the use of sampling for apportionment, the Census Bureau was heavily involved in the first phases of sampling for the Integrated Coverage Measurement (ICM). The goal of the ICM was to produce reliable estimates of coverage of each state's total population, and this required a very large sample (a 750,000 housing unit sample was planned). As a result of the Supreme Court ruling, state population estimates for apportionment were no longer key estimates of the coverage survey; instead, the goal was to measure census coverage for national and subnational population domains having different census coverage properties. These estimates could be measured with sufficient precision with a sample of about 300,000 housing units.

Rather than abandoning the effort, i.e., software development, etc., that had already been invested in the ICM, it was more efficient, particularly from a software quality perspective, to complete the sampling for the ICM, and then select a subsample for the A.C.E. The infrastructure for the field staff was being deployed in preparation for the first field operation that started in September, 1999, and the development of the sampling system that was scheduled to begin production in March, 1999 was well underway. There was not adequate time to redesign the A.C.E. sample allocation entirely, select the sample, produce the different listing materials including maps, conduct the listing as scheduled, and ensure a high level of quality in a revised software system. Consequently, the A.C.E. sample design was derived from the ICM design using a double sampling approach. The entire ICM sample was selected as originally planned and then reduced through various steps to yield the A.C.E. target housing unit sample.

The first-phase sampling consisted of:

* Forming primary sampling units.
* Stratifying primary sampling units.
* Systematic sampling of primary sampling units.

The A.C.E. primary sampling unit was the block cluster, a group of one or more geographically contiguous census blocks. To make efficient field workloads, the target size of block clusters was about 30 housing units, although block clusters varied in size. Within each state, block clusters were stratified by size using housing unit counts from a preliminary census address list: small (0 to 2 housing units), medium (3 to 79 housing units), and large (80 or more housing units). Some states included a separate sampling stratum for American Indian Reservations. Within each sampling stratum, a systematic sample of block clusters was selected with equal probability. This phase of sampling yielded 29,136 block clusters with an estimated 2 million housing units in the 50 states and the District of Columbia.

Table 2-2. Sample Sizes by Sampling Phase and Operation

Sampling phase                               Areas   Housing units  Operations
First-phase                                  29,136  1,989,000      Independent listing
Second-phase                                 11,303  844,000        Initial housing unit matching/follow-up
Subsampling within large cluster (P-sample)  11,303  301,000        A.C.E. person interviewing, person matching/follow-up, dual system estimation
E-sample identification                      11,303  311,000        Person matching/follow-up, dual system estimation

Activity 2: Independent Listing

Timing: September through early-December, 1999; well before census enumeration began.

Field staff visited the sample block clusters and created an independent address list of all housing units, including housing units at special places. The goal of this operation was to create an independent address frame of all the housing units that were likely to exist on Census Day, April 1, 2000. Since this operation occurred prior to Census Day, any potential housing unit structures were included on the independent address list. Later, during housing unit follow-up, these structures were visited to confirm that they actually contained housing units on Census Day. Since housing units could not be added to the independent address frame in this later operation, but could be removed, it was important to include structures with questionable housing unit status during the independent listing.

This listing consisted of approximately 2 million housing units or potential housing units in the 50 states and the District of Columbia.

Activity 3: Second-Phase Sampling

Timing: December, 1999 through February, 2000; prior to mailing the census questionnaire.

The second phase of sampling selected block clusters from the first phase to be the final A.C.E. sample areas.
Block clusters were stratified using two housing unit counts: 1) a count from the independent listing operation, and 2) a count from the updated census address list as of January, 2000. It was important to reduce the first-phase sample before the next operations, the housing unit matching and field follow-up, to reduce the number of clusters going into those operations. The stratification of the block clusters was done separately by first-phase sampling strata: 1) medium and large strata, and 2) small strata. All first-phase clusters from the American Indian Reservation stratum were retained in the second-phase sample.

Medium and large strata. The resulting national sample allocation was roughly proportional to state population with some differential sampling within states. The two goals of the differential sampling were: 1) to provide sufficient sample to support reliable estimates for several sub-populations, and 2) to reduce the variance contribution due to clusters with the potential for high omission or erroneous enumeration rates. These clusters were identified and put into separate sampling strata by comparing the consistency of housing unit counts between the independent list and the updated census list for each cluster.

Small cluster stratum. Conducting interviews and follow-up operations in small block clusters is much more costly per housing unit than in medium or large block clusters. Lower sampling rates were, therefore, used in this stratum. However, two considerations were taken into account in establishing the lower rates. One goal was to avoid having small clusters with an overall probability of selection much lower than the probability of selection of other clusters in the sample. A second goal was to have higher probabilities of selection for small clusters in which the number of housing units was greater than the expected 0 to 2 housing units. These two goals attempted to reduce the contribution of small clusters to the variance of the dual system estimates. Small block clusters with the potential for high erroneous enumeration or nonmatch rates were retained at higher rates. The second-phase sample contained 11,303 block clusters for the 50 states and the District of Columbia.

Activity 4: Initial Housing Unit Matching and Field Follow-Up

Timing: February through April, 2000; prior to census nonresponse follow-up.

The objectives of these operations were:

1. Create a list of confirmed A.C.E. housing units in order to:
  * obtain the best list of housing units to facilitate person interviewing in later activities.
  * have better control of the final A.C.E. housing unit sample size.
2. Establish a link between the A.C.E. and census housing units in order to:
  * identify the A.C.E. housing units eligible for telephone interviewing.
  * facilitate overlapping P and E samples.
3. Identify potential geocoding errors in order to:
  * establish the targeted extended search sampling frame.
  * identify sample areas for which the creation of a new independent address list, or relisting, was necessary.

Housing unit matching. The housing units on the census address list in January, 2000 were matched to the A.C.E. independent address list. First, the addresses were computer matched. The computer matching was followed by a clerical review of the computer match results in an automated environment intended to find additional matches using supplemental materials. There was also a clerical search, limited to the block cluster, for duplicate housing units during this phase of the matching. Possible duplicates in both the A.C.E. and the census were identified.

Housing unit follow-up. In some cases, the computer and clerical matching were not able to determine the status of a housing unit. Field staff visited these cases to get more information about these housing units. After matching, the cases which were not matched, possibly matched, or possible duplicates were sent to the field for follow-up interviews. Some of the matched cases were also sent for additional information. The field follow-up was designed to determine if a housing unit existed, if it existed in the block cluster, or if different addresses were referring to the same housing unit.

Activity 5: Targeted Extended Search

Timing: May, 2000.

The targeted extended search was designed to improve the accuracy of the dual system estimate by searching for matches, correct enumerations and duplicates one ring beyond the sample block cluster. The operation was implemented in a subset of A.C.E. block clusters selected through a combination of certainty and probability sampling.

There are census geocoding errors of exclusion and inclusion in the A.C.E. sample block clusters. Census geocoding errors of exclusion (i.e., housing units miscoded in the census so they appear to be outside the A.C.E. block cluster) affect the P-sample match rate. Census geocoding errors of inclusion (i.e., housing units miscoded in the census to appear inside the block cluster) affect the erroneous enumeration rate in the census or E sample. If the census housing unit is omitted from the sample block cluster, the P-sample household cannot be matched. This yields a lower match rate. On the E-sample side, if a housing unit is included in the sample block cluster due to a geocoding error, the E-sample people will be considered erroneously enumerated.

The primary motivation for using an extended search area was to reduce the sampling variance of the dual system estimates due to census geocoding error. Even though the extended search allowed more P-sample people to be matched and more E-sample people to be converted to correct enumerations, the expected value of the dual system estimate should not be affected as long as the two
samples were treated equally with respect to the search area. Another benefit is that the extended search makes the dual system estimate more robust by protecting against potential bias due to P-sample geocoding error.

Previous census evaluations have shown that geocoding errors are highly clustered. The targeted extended search was designed to take advantage of the distribution of geocoding errors by focusing on those clusters that contain the most potential geocoding errors. The implementation of this operation resulted in dual system estimates with more precision.

The initial housing unit matching results were used to identify the A.C.E. housing unit nonmatches and census housing unit geocoding errors. Clusters without A.C.E. housing unit nonmatches or census geocoding errors were out-of-scope for the targeted extended search sampling. Changes to the census inventory of housing units after January, 2000 were not reflected in the housing unit matching used to identify targeted extended search clusters.

Only whole households of nonmatched people were eligible for the extended search during person matching. Partial household nonmatches (i.e., some household members were matches) were not as likely to indicate that the housing unit was a geocoding error.

Activity 6: Subsampling Within Large Block Clusters

Timing: April and May, 2000; during census nonresponse follow-up.

Subsampling was used in large block clusters for the final selection of housing units to participate in the P sample.
The objective was to reduce costs and yield manageable field workloads without seriously affecting the precision of the A.C.E. by taking advantage of the high intra-class correlation expected in large block clusters. Since the large block clusters had a higher initial probability of selection than medium block clusters, the reduction in sample size had a fairly minor effect on the precision of the A.C.E. estimates. The subsampling of housing units within large clusters brought the overall probability of selection of these housing units more in line with housing units in the medium clusters.

Any block cluster with 80 or more confirmed A.C.E. housing units, based on the initial housing unit match, was eligible for this housing unit reduction. The reduction of housing units within a large block cluster was done by forming groups of adjacent housing units, called segments, and selecting one or more segments for A.C.E. person interviewing. The segments had roughly equal numbers of housing units within a block cluster. Segments of housing units were used as the sampling unit in order to obtain compact interviewing workloads and to facilitate the identification of an overlapping E sample. The A.C.E. housing units that were retained after all of the subsampling comprise the P sample.

After the reduction of housing units within large block clusters was completed, the A.C.E. interview sample size for the 50 states and the District of Columbia was approximately 300,000 housing units.

Activity 7: A.C.E. Person Interviewing

Timing: April through mid-June, 2000 for the telephone phase; mid-June through mid-September, 2000 for the personal visit phase; after census enumeration was complete.

The goal of the A.C.E. person interview was to provide a list of persons who lived at the sample address on Census Day, as well as those who lived at the address at the time of A.C.E. interviewing. The A.C.E. person interview was conducted using a Computer Assisted Personal Interview (CAPI) instrument.

To get an early start on interviewing, a telephone interview was conducted at households for which the census questionnaire was data-captured and included a telephone number. Both households with mail returns and enumerator-filled questionnaires were eligible for telephone interviews. Certain types of housing units, such as those without house number and street name, were not eligible for a telephone interview. All remaining interviews following the telephone operation were conducted in person. However, some nonresponse conversion operation interviews and interviews in gated communities or secured buildings were conducted by telephone.

The person interview was conducted only with a household member during the first 3 weeks of interviewing. If an interview with a household member was not obtained after 3 weeks, an interview with a nonhousehold member was attempted. This was called a proxy interview. Proxy interviews were allowed during the remainder of the interviewing period. During the last 2 weeks of interviewing a nonresponse conversion operation was attempted for the
noninterviews using interviewers who were considered to be the best available.

Activity 8: E-Sample Identification

Timing: October, 2000.

The E sample consisted of the census enumerations in the same sample areas as the P sample. All data-defined census person records in the A.C.E. block clusters were eligible to be in the E sample.[1] To be a census data-defined person, the person record must have two 100-percent data items filled. Name was not required for the person record to be considered data-defined, but could be one of the two items required to be data-defined.

Like the P sample, it was sometimes necessary to subsample the census housing units in a cluster when it contained a large number of census housing units. The goal of the E-sample identification was to create overlapping P and E samples in an effort to reduce person follow-up workloads. An overlapping P and E sample is not necessary, but improves both the cost effectiveness of the subsequent operation and the precision of the dual system estimates.

If a block cluster had fewer than 80 census housing units, then all of the census housing units in the block cluster were in the E sample. For block clusters with 80 or more census housing units, the within-cluster segments of adjacent housing units defined for the P-sample reduction were mapped onto the census records. This was possible when a link between the census and A.C.E. housing unit was established during the initial housing unit matching.
Using specific rules, census housing units that did not have this link were assigned to a segment. The segment selected for the P sample was selected for the E sample. If the sample segment contained 80 or more census housing units with no established link to an A.C.E. housing unit, then a systematic sample of these housing units was selected to reduce the E-sample person follow-up workloads.

This resulted in approximately 311,000 census housing units in the E sample for the 50 states and the District of Columbia.

[1] Excludes data-defined person records temporarily removed from the census.

Activity 9: Person Matching and Field Follow-Up

Timing: October and November, 2000.

Insufficient information for matching. Rules were established for determining which person records had sufficient information for matching. These rules were established and applied before the start of the matching operation to avoid introducing potential bias into the matching results. Both the P and E samples used the same rules. Each person record required a complete name and two other characteristics.

Person matching. All P-sample persons who lived at each sample housing unit on Census Day were matched to the people enumerated in the census to estimate the match rate. Census persons in the E sample who matched to the P sample were considered to be correctly enumerated. The E-sample person records that did not match to the P sample were interviewed during field follow-up operations to classify them as correctly or erroneously enumerated. This matching was a computer operation with clerical review. Variables such as name, address, date of birth, age, sex, race, Hispanic origin, and relationship were used to identify matches between the P sample and census enumerations. Duplicates were identified in both the P sample and E sample. If a case qualified for targeted extended search, the search for matches and duplicates was extended to the ring beyond the sample block cluster.

Person follow-up. The person follow-up interview collected additional information that was sometimes necessary for the accurate coding of the residence status of the nonmatched P-sample people and the enumeration status of the nonmatched E-sample people. The goal of this operation was to confirm that ambiguous P-sample nonmatches actually lived in the sample block cluster on Census Day. Thus, follow-up interviews for P-sample nonmatched cases were carried out when there was a possibility the residence status was not correct. Similarly,
E-sample nonmatch cases were subject to follow-up interviews to determine if they were correctly or erroneously enumerated in the block cluster. Possible matches were interviewed to resolve their match status. There were also other cases sent to follow-up, such as matched people with unresolved residence status and other types of cases considered to have the potential for geographic errors in the P sample. The person follow-up interview used a paper questionnaire. Interviewers gathered information that permitted each person to be coded as a matched resident/nonresident or a nonmatched resident/nonresident of the block cluster on Census Day. There was considerable emphasis on obtaining a knowledgeable respondent before the follow-up questions were asked. After the follow-up interview was completed, the results were reviewed by clerks who assigned final status to these cases using an automated system.

Activity 10: Missing Data Processing

Timing: December, 2000 through the early part of January, 2001.

Since the results of the matching operation were to be used in the estimation phase of the A.C.E., it was necessary to determine the match, correct enumeration and residence status of all sample cases. When these could not be resolved through computer and clerical matching or through field follow-up interviews, the match, correct enumeration, or residence probabilities were imputed based on the distribution of outcomes of the resolved follow-up interviews. Also, as in the census, some respondents did not answer all the questions in the A.C.E. interview which were needed for estimation. If the variables tenure, sex, race, Hispanic origin, or age were blank for P-sample individuals, the missing information was imputed based on the distribution of the variable within the household, the overall distribution of the variable, or using hot-deck methods, depending on the variable. Imputation for missing information in the E sample was resolved in the census processing. Finally, a noninterview adjustment was made to account for the weights of households that should have been interviewed in A.C.E., but were not.

Activity 11: Dual System Estimation

Timing: Late January, 2001.

Dual system estimation was used to estimate the net undercount or overcount of the household population included in the census. Coverage estimates of persons living in group quarters or in Remote Alaska areas were not made.

The term dual system estimation is used because data from two independent systems are combined to measure the same population. After matching to the census, the P sample was used to measure the omission rate in the census. The E sample was used to measure the erroneous enumeration rate in the census. The dual system estimator assumes that all persons have the same probability of being captured in the census. This is obviously an oversimplification of the existing situation. Post-stratification sharply reduced the likelihood that this assumption would bias the results, since it only requires equal capture probabilities within post-strata.
Post-stratification. Dual system estimation was used to calculate the proportion of persons missed in each of a number of relatively homogeneous population groups called post-strata. The post-strata for the Census 2000 A.C.E. were defined by the variables: race/Hispanic origin domain, age/sex, tenure, census region, metropolitan statistical area size/type of enumeration area, and census return rate. A complete cross-classification of these variables would have unnecessarily increased the variances of the estimates due to small expected sample sizes in many of the post-strata. Consequently, many of the detailed cells were combined. In the United States, there were 448 potential post-strata which were collapsed to 416 post-strata on the basis of small observed sample sizes or high coefficients of variation.

The dual system estimate. The dual system estimate (DSE) for each post-stratum was defined by:
$$\mathrm{DSE} = DD \times \frac{CE}{N_e} \times \frac{N_p}{M}$$

where $DD$ was the number of data-defined persons in the census at the time of A.C.E. matching,[2] $CE$ was the weighted estimate of the number of people in the census who were correctly enumerated, $N_e$ was the weighted estimate of the number of people in the census, $N_p$ was the weighted estimate of the number of people found by the independent A.C.E. collection procedures, and $M$ was the weighted estimate of the number of persons found by the independent A.C.E. collection procedures who were matched to persons enumerated in the census.

Activity 12: Model-Based Estimation for Small Areas

Timing: February, 2001.
Activities 1 through 11 were designed to provide estimates of net coverage for Census 2000. These estimates can serve two purposes. One purpose was to provide information on the quality of the census so that analysts can make more intelligent use of the data, and to help the Census Bureau improve procedures for future censuses. The second purpose was to have a basis for adjusting the census counts for net coverage, if deemed appropriate. The sample sizes used in the A.C.E. provided adequate reliability for such estimates for the U.S. as a whole, and for major geographical areas. However, the sample sizes were too small to provide reliable estimates for most states, counties, cities, and the thousands of other municipalities that normally make use of census data. As a result, model-based estimation was used in these areas.

Model-based estimation treats the coverage correction factors as uniform within a given post-stratum. Another way of saying this is that the coverage error rate for a given post-stratum is assumed to be the same within all geographic areas. This assumption is obviously an oversimplification, and small errors are introduced. However, the model-based estimates provide a consistent set of estimates in which the sum of the population counts for small areas is equal to the dual system estimates of much larger areas (e.g., the U.S. total, regions, etc.).

Coverage correction factors were obtained by dividing the dual system estimates by the census counts of persons in housing units. Persons in group quarters were not adjusted for net coverage. Coverage correction factors for population groups that generally had good coverage were close to 1.00. Population groups with poor coverage had coverage correction factors higher than 1.00, while coverage correction factors less than 1.00 in a post-stratum occurred when erroneous enumeration rates in the census exceeded omission rates.

A coverage correction factor was calculated for each post-stratum. If a post-stratum was estimated to have more persons than the census count, within each block a random sample of the appropriate size of census people in the post-stratum was selected. The data of the selected people were replicated in their blocks with a weight of +1. If a post-stratum was estimated to have fewer people than the census count, within each block a random sample of the appropriate size of people in the post-stratum was selected. The data of the selected people were replicated in their blocks with a weight of -1. Under this procedure no reported data for any individual was removed from the Census 2000 data files. A controlled rounding procedure was used to produce integer-valued model-based estimates at various geographic levels. Estimates were made at various levels by aggregating the data from the appropriate blocks and/or post-strata.

[2] The data-defined persons term excludes cases temporarily removed from the census.
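The dual system estimate and the coverage correction factor described in this chapter reduce to two short calculations per post-stratum. The following Python sketch is illustrative only: the class and function names are hypothetical, the input values are made up to exercise the arithmetic, and it assumes the form DSE = DD x (CE / N_e) x (N_p / M) given above. It does not attempt the weighting, imputation, or controlled rounding steps.

```python
from dataclasses import dataclass


@dataclass
class PostStratumInputs:
    """Weighted inputs for one post-stratum (field names are illustrative)."""
    dd: float   # data-defined persons in the census
    ce: float   # weighted estimate of correctly enumerated census persons
    n_e: float  # weighted estimate of persons in the census (E sample)
    n_p: float  # weighted estimate of persons found by the A.C.E. (P sample)
    m: float    # weighted estimate of P-sample persons matched to the census


def dual_system_estimate(x: PostStratumInputs) -> float:
    """Return DD * (CE / N_e) * (N_p / M) for one post-stratum."""
    if x.n_e <= 0 or x.m <= 0:
        raise ValueError("N_e and M must be positive")
    return x.dd * (x.ce / x.n_e) * (x.n_p / x.m)


def coverage_correction_factor(dse: float, census_count: float) -> float:
    """Return the DSE divided by the census count of persons in housing units."""
    if census_count <= 0:
        raise ValueError("census_count must be positive")
    return dse / census_count


# Made-up inputs, not A.C.E. results.
example = PostStratumInputs(dd=1000.0, ce=950.0, n_e=1000.0, n_p=1000.0, m=900.0)
dse = dual_system_estimate(example)
ccf = coverage_correction_factor(dse, census_count=1020.0)
print(round(dse, 2), round(ccf, 4))
```

A factor above 1.00 in this sketch corresponds to a post-stratum estimated to have more persons than the census count; a factor below 1.00 corresponds to the opposite case.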
Chapter 3. Design of the A.C.E. Sample

INTRODUCTION

The A.C.E. sample design was a multiphase, national sample of 301,000 housing units. Its development was heavily influenced by its planned predecessor, the Integrated Coverage Measurement survey (ICM). Initial plans for Census 2000 were for a one-number census corrected for coverage based on the ICM. A primary purpose of the ICM was to produce direct state estimates of coverage with sufficient reliability for apportionment population counts. This called for a state-based design and a much larger overall national sample of 750,000 housing units. The January 1999 Supreme Court ruling against the use of sampling for apportionment resulted in a change of plans for the Census 2000 coverage survey for which the primary goal became the production of reliable national census coverage estimates, and of selected sub-populations. This did not require as large a sample.

The A.C.E. sample design was derived from the ICM sample design. By the time the change of plans for the Census 2000 coverage survey occurred, many operational plans for the ICM were too far advanced to make significant changes required for a newly conceived sample design plan. The implementation plans and software systems for creation of the sampling frame and selection of the ICM sample were moving along and almost ready to start. Much of the field office infrastructure and staffing was being put in place for the first field operation under
the ICM sample plan. It was critical to proceed as planned in order to meet schedules.

The A.C.E. sampling plan was thus developed as a multiphase design. The much larger ICM sample was first selected. Field staff canvassed the sample areas to create an independent address list. Then, using updated measures of size from the field canvass, the ICM sample was re-stratified and reduced with differential probabilities of selection to create the A.C.E. sample design.

Sections on the A.C.E. sample and its design are directed to a general audience. They provide results of the A.C.E. sample along with a broad overview of the sample design. Later sections of this chapter provide a more in-depth description of the A.C.E. design and are available for readers who desire greater detail.

A.C.E. SAMPLE OVERVIEW AND RESULTS

The A.C.E. consisted of two parts. The Population Sample, P sample, and the Enumeration Sample, E sample, have traditionally defined the samples for dual system estimation. Both the P sample and the E sample measured the same household population. However, the P-sample operations were conducted independent of the census. The E sample consists of census enumerations in the same sample areas as the P sample. After matching with the census lists and reconciliation, the P sample yields an estimated rate at which the population was missed in the census whereas the E sample yields an estimated rate at which enumerations were erroneously included in the census. Combining them yields an A.C.E. estimate of net census coverage of the household population.

The Accuracy and Coverage Evaluation had three sampling phases:

1. First-phase sample. The selection of the ICM sample, comprising a large number of sample areas for which a list of housing unit addresses was created independent of the census.
2. Second-phase sample. The reduction of the first-phase sample which resulted in the A.C.E. sample areas.
3. Third-phase sample. The reduction of housing units by subsampling within unusually large A.C.E. sample areas.

Table 3-1 summarizes the A.C.E. sample size after each phase of sampling for the United States. The dates given in the table are the production dates. The housing unit counts are approximate, based on the best known information at the time of the particular sampling phase.

Table 3-1. Census 2000 A.C.E. Sample Sizes by Sampling Phase

Start and finish date                 Sampling phase                    Sample areas   Estimated housing units
March, 1999 thru June, 1999           First-phase                             29,136                 1,989,000
December, 1999 thru February, 2000    Second-phase                            11,303                   844,000
April, 2000 thru May, 2000            Third-phase                             11,303                   301,000
                                        within-cluster reduction               3,153                   106,000
                                        no within-cluster reduction            8,150                   195,000

SURVEY CHARACTERISTICS AND THE A.C.E. SAMPLE DESIGN

Main Characteristics of the A.C.E. Sample

The A.C.E. sample:

* Is a probability sample of 301,000 housing units in 11,303 sample areas for the United States.
* Yields estimates of net census coverage of persons in households and housing units for the nation excluding Remote Alaska.
* Has independent samples in each state and the District of Columbia, but there are no state-based design criteria.
* Has total state sample sizes roughly proportional to population size with the exception that the smaller states have additional sample; these smaller states have similar sample sizes.
* Uses some differential sampling within states for areas that may contribute disproportionately to total variance or have higher concentrations of historically undercounted population groups.
* Has a separate sample of American Indian Reservation and other associated trust lands.
* Uses updated measures of size at each phase of sampling.
* Balances operational limitations such as field workloads and statistical issues such as weight variation.

Overview of the Design

The A.C.E. uses a multiphase sample to measure the net coverage for the household population in Census 2000. The national sample, 301,000 housing units in 11,303
sample areas, was distributed among the 50 states and the District of Columbia roughly proportional to population size except for the smaller states that had their samples increased.

Primary sampling unit. The block cluster was the Primary Sampling Unit (PSU) for the A.C.E. Each block cluster consisted of one or more geographically contiguous census blocks. Each block cluster contained on average 30 housing units, which was an efficient interviewer workload. An important block cluster characteristic was well-defined, physical boundaries. Ambiguous block cluster boundaries could potentially lead to errors of omission or erroneous inclusion in the A.C.E. sample.

Phases of the A.C.E. sample. Three phases of the A.C.E. sampling were:

1. Selection of an initial sample of approximately 30,000 block clusters for which the field staff developed an independent list of housing unit addresses.
2. Selection from the initial sample results of a subsample of block clusters for the A.C.E. sample based on the results of the independent list.
3. Selection of a subsample of housing units within large block clusters.

First phase consisted of the selection of a systematic sample in each state. In the first phase of the A.C.E. sampling, block clusters in each state were classified by size into four mutually exclusive groups known as sampling strata: (1) clusters with 0 to 2 housing units (small stratum), (2) clusters with 3 to 79 housing units (medium stratum), (3) clusters with 80 or more housing units (large stratum), and (4) clusters on American Indian Reservations with three or more housing units (American Indian Reservation stratum). Block clusters with 80 or more housing units were selected with higher probability than medium clusters in this phase because housing units in large clusters were subsampled in a later operation, bringing the overall probability of selection (the inverse of the sampling weight) for housing units in these clusters more in line with the overall selection probabilities of housing units in medium clusters. Within each sampling stratum, clusters were sorted and a systematic sample was selected with equal probability.

Second phase involved the reduction of the ICM first-phase sample to the level desired for the A.C.E. In the second phase, the block clusters from the medium and large sampling strata were re-stratified based on the estimated demographic composition of the block clusters and the relationship between the housing unit count from the independent list and the January 2000 updated census address list. This was done separately for the medium and large strata within each state. These substrata are referred to as reduction strata. Within each reduction stratum, the clusters were sorted, and a systematic sample was selected with equal probability within each reduction stratum. This reduction used different selection probabilities across the reduction strata within a state and across states.

Next, using housing unit counts from the independent list and the January 2000 updated census address list, the small block clusters were stratified within each state by size, and systematic samples were selected from each stratum with equal probability. All clusters from the small sampling stratum with 10 or more housing units based on the updated information were retained. All clusters from the small sampling stratum that were on American Indian land as well as List/Enumerate clusters were also retained.
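The stratum definitions and the equal-probability systematic selection described for the first two phases can be sketched in Python as follows. This is a hedged illustration, not the Census Bureau's production procedure: the function names are hypothetical, and the selection uses a common textbook variant (a fractional interval N/n with a random start), since the text does not specify the sort keys, the interval rule, or the random-start rule actually used.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def first_phase_stratum(housing_units: int, on_reservation: bool) -> str:
    """Classify a block cluster into one of the four first-phase sampling strata."""
    if housing_units < 0:
        raise ValueError("housing_units must be non-negative")
    if on_reservation and housing_units >= 3:
        return "american_indian_reservation"
    if housing_units <= 2:
        return "small"
    if housing_units <= 79:
        return "medium"
    return "large"


def systematic_sample(sorted_units: Sequence[T], n: int, rng: random.Random) -> List[T]:
    """Select n units from an already sorted list with equal probability.

    Assumption: interval = N / n and a uniform random start in [0, interval).
    """
    total = len(sorted_units)
    if not 0 < n <= total:
        raise ValueError("n must be between 1 and the number of units")
    interval = total / n
    start = rng.random() * interval
    # min() guards against a floating-point edge case at the end of the list.
    return [sorted_units[min(int(start + i * interval), total - 1)] for i in range(n)]


# Hypothetical usage: 100 sorted cluster identifiers, 10 selected.
clusters = list(range(1, 101))
print(systematic_sample(clusters, 10, random.Random(2000)))
```

Because the list is sorted before selection, a systematic sample spreads the selected clusters across the sort order, which is the usual reason for sorting first.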
The second phase of sampling was not done for the American Indian Reservation sampling stratum.

The third phase consisted of the sample reduction of housing units within large block clusters. In the third phase of A.C.E. sampling, a subsample of housing units was selected within large clusters. If a cluster contained 79 or fewer housing units, all the housing units were included in the A.C.E. sample. In clusters with 80 or more housing units, a subsample was selected to reduce the cost of data collection. This phase of sampling resulted in lower variation of selection probabilities for housing units within the same reduction stratum because the large clusters had a higher probability of selection at the first phase. This subsampling was done by forming groups of adjacent housing units, called segments. A systematic sample of segments within each cluster was selected. All housing units in the selected segments were included in the A.C.E. sample.

The P sample and the E sample. The P sample consisted of the households used for the A.C.E. interviews that were conducted in these selected block clusters and block cluster segments. The E sample was the set of census enumerations in these same block clusters and block cluster segments.

Measures of Size

As stated earlier, the A.C.E. sample design used updated measures of size at each phase of sampling.

First-phase sample. The block cluster measure of size for the first-phase sample was based on preliminary census files existing in the spring of 1999. Ideally, the source of the block cluster measure of size would have been the Decennial Master Address File, the base file of census addresses for the decennial programs. However, the first version of this file was not available until the summer of 1999, too late for use in the block clustering. Instead, the first-phase measure of size was typically the higher of the preliminary census housing unit count or the 1990 census address count for a block cluster containing city-style addresses, house number and street name. For block clusters with non-city-style addresses, the measure of size was the preliminary 2000 census housing unit count. The rules for determining which housing units on the preliminary 2000 census files would eventually move forward to the Decennial Master Address File had not been defined,
so the block cluster measure of size was based on a reasonable set of criteria, but not the final set.

Second-phase sample. For the second phase of sampling, the block cluster measure of size was the count of housing units on the list of housing unit addresses created independently of the census in the fall of 1999. The reduction of the medium and large block clusters used a preliminary count of these housing units, which was a clerical tally of housing units from the listing sheets. The small block cluster reduction used the count of housing units from the independent listing sheets after the addresses had been keyed. For the most part, the preliminary and the keyed counts for each block cluster were identical, but for some clusters there were differences. Using a preliminary count was necessary because the medium and large cluster reduction had to be completed before the keying of the independent listing sheets was done.

Third-phase sample. For the third phase of A.C.E. sampling, the block cluster measure of size was the housing unit count resulting from the housing unit matching and follow-up operation. This operation confirmed the count resulting from the independent listing and removed any nonexistent addresses from the sampling frame.

FIRST PHASE OF THE A.C.E. SAMPLE DESIGN

The sample selection during the first phase consisted of three major steps:

1. Definition of the primary sampling units.
2. Stratification and allocation of the primary sampling units within each state.
3. Selection of the primary sampling units within each state.

Defining the Primary Sampling Unit

The Primary Sampling Units (PSUs) for the A.C.E. were block clusters. The PSUs were delineated in such a way that they encompass the entire land area of the United States, except for extremely remote areas of Alaska. Each block cluster consisted of a census block or several geographically contiguous census blocks. They contained an average of 30 housing units. The land area for each PSU was made reasonably compact so it could be traversed by an interviewer in the field without incurring unreasonable costs.

Why the block cluster? A basic design decision, which was a continuation from the 1990 Post-Enumeration Survey, was that the PSU would be a block cluster, a single block or a group of adjacent blocks established for the collection of Census 2000 information. These blocks may be standard city blocks or irregularly shaped areas with identifiable political or geographic boundaries. Using block clusters as PSUs, instead of counties or county groups that are more commonly used in national surveys, improved the precision considerably with only a modest increase in costs.

An alternative sample design was considered that would have defined PSUs by segmenting whole blocks into smaller components (roughly one-half of a block). The alternative design would likely have resulted in reduced sampling error, but was rejected because it would increase costs (primarily due to increased matching workloads and interviewer travel) and probably would have resulted in matching errors due to problems in identifying (spatially) the PSU boundaries.

Goals of block clustering. Block clusters were formed to meet both statistical and operational goals. In the Census 2000 Dress Rehearsal, a small census block was by definition a single block cluster. This rule led to a large number of small block clusters that could potentially exert undue influence on the final population and variance estimates. One feature of block clustering under the Census
2000 A.C.E. procedure was to combine small census blocks with adjacent census blocks, if the neighboring block contained one or more housing units. This change in the treatment of small census blocks had an enormous impact on the number of small block clusters, which was reduced by approximately 65 percent as seen in Table 3-2. Still, many block clusters contained zero housing units. Roughly 70 percent of the zero housing unit blocks occurred in sparsely populated areas. Without populated neighboring blocks, these zero housing unit blocks remained stand-alone zero block clusters.

The two operational goals of forming block clusters were to increase listing efficiency and to reduce the chance of listing error. The first goal was met by collapsing census blocks to produce block clusters that were geographically compact and which averaged about 30 housing units, a manageable workload. The second goal was to create block clusters that were well defined to minimize the chance that the cluster would be listed incorrectly. For example, a listing error may result when a census block has an invisible or nonphysical boundary such as city limits making it unclear where the block boundary was. As a result, census blocks separated by invisible boundaries were always combined.
Table 3-2. Accuracy and Coverage Evaluation: Block Cluster Summary Statistics [1]

                                        Preliminary number of housing units
                                        0-2           3-79          80+           Total
Number of census blocks [2]             2,969,000     4,009,000     245,000       7,223,000
Number of block clusters                1,029,000     2,486,000     252,000       3,767,000
Number of blocks per cluster [3]        1.3           2.2           1.5           1.9
Number of housing units per cluster     0.3           29.2          181.9         31.5

[1] The United States and Puerto Rico are included in these summary statistics.
[2] Count of census collection blocks before clustering and before block suffixing. Does not include water blocks or census blocks in Remote Alaska.
[3] These numbers are not the first row divided by the second row. They are the number of census blocks in each block cluster size category divided by the number of block clusters in each category. For example, if two census blocks with 40 housing units collapse to form an 80 housing unit block cluster, those two census blocks are counted in the 80+ category for the number of blocks per cluster computation. Block clustering can combine across categories; therefore, the first and second rows are not consistent.

Limitations. As mentioned earlier, the block cluster measure of size for the first phase was based on preliminary census address counts. Some census operations that helped build the census address list were not available at the time block clustering started. Instead, a snapshot of the best known information was used. This presented some limitations with the data used for block clustering.

* Address limitations: The results of the Block Canvassing and Local Update of Census Addresses (LUCA) operations were not incorporated into the census address list in time for block clustering. Block Canvassing was a Census 2000 field operation in mailout/mailback areas (mostly city-style addresses). The Census Bureau sent staff into the field to canvass their assignment areas and provide updates to the address list such as corrections, adds, or deletes. Local Update of Census Addresses was also a Census 2000 program that provided an opportunity for local and tribal governments to review and update address information in the census address list.
* Geographic limitations: Each block in the census address list had a Type of Enumeration Area (TEA) assignment. For Census 2000, TEA is a classification that identified both the census enumeration method and the method used to compile the census address list. The block clustering operation occurred concurrently with the census review of TEA assignments to ensure the most complete coverage of the area. This review process sometimes changed the TEA assignment of blocks after the block cluster was defined. On a few occasions, this resulted in a block cluster consisting of blocks that had different methods for compiling the census address list. For example, a block cluster consisted of three blocks, and all three blocks had a TEA assignment of Block Canvassing and Mailout/Mailback at the time of block clustering. After the census TEA review, one of those blocks was converted to an Address Listing and Update/Leave TEA assignment. For a complete list of TEAs for Census 2000, see the attachment or visit http://www.geo.census.gov/mob/homep/teas.html.

General rules for defining block clusters.

* Block clusters were formed by combining neighboring Census 2000 blocks.
* Block clusters did not cross specific geographical boundaries. Among these were county, interim census tract, Local Census Office, TEA group, military area, and American Indian Country. For TEA groups, blocks from certain TEAs could be clustered together if the TEAs had the same method for compiling the address list. American Indian Country refers, collectively, to lands that are American Indian Reservation or other trust lands, tribal jurisdiction statistical areas (now known as Oklahoma Tribal Statistical Areas), tribal designated statistical areas, and Alaska native village statistical areas.
* Blocks separated by an invisible boundary, a city line, for example, were clustered except for the situations described above.
* Whenever possible, small census blocks, those with fewer than three housing units, were clustered with neighboring census blocks containing housing units to reduce the total number of small block clusters. If there was no neighboring census block with housing units, the small census block was a cluster by itself.
* To prevent block clusters from becoming too large with respect to housing unit size, census blocks with 80 or more housing units were generally not clustered with other census blocks.
* In addition to the criteria of unit size, any block larger than 15 square miles was generally a block cluster by itself.

These rules produced 3.8 million block clusters, about half the 7.2 million non-suffixed census blocks. The block clusters had an average of 29.2 housing units per medium block cluster and an average of 31.5 overall. The number of small block clusters also decreased from nearly three million to about one million, an approximate 65 percent reduction from the Census 2000 Dress Rehearsal rules of defining a small block to be a cluster by itself. However, since about 70 percent of small blocks occurred in less populated areas with little or no population to combine, many single zero-housing unit block clusters were formed.

Stratifying and Allocating the Primary Sampling Units

Stratifying the first-phase sample. Prior to sampling, block clusters were stratified according to the expected number of housing units and the American Indian Reservation (AIR) status of the block cluster. The four sampling strata and their definitions are presented in Table 3-3.

Allocating the first-phase sample. As stated earlier, the Census Bureau was preparing to conduct the ICM, a much larger coverage measurement survey of 750,000 housing units, when the use of sampling for apportionment counts was disallowed by the Supreme Court in January, 1999. To keep the coverage measurement survey on schedule, the Census Bureau went ahead with the plans to select the ICM sample and create independent address lists. This was followed by the subsampling of the first-phase sample to produce the A.C.E. sample design.

The first-phase sampling plan was a national sample of 30,000 block clusters: 25,000 medium and large block clusters and 5,000 small block clusters. Included in the 25,000 block clusters was a separate sample of block clusters for American Indian Reservations.

It is important to point out that the allocation of the 25,000 medium and large block clusters was dependent on the ICM sample design and under the assumption of roughly 30 housing units per block cluster. The allocation of the 5,000 small block clusters to the states and the
separate American Indian Reservation sample to the states was done prior to defining block clusters for all states, since the first-phase sampling was done on a state-by-state flow basis. This means that the first-phase sample was selected for some states before the block clusters had been defined for other states. As a result, we used the best information we had at the time to carry out the
allocation.

Medium and large block clusters. The 25,000 medium and large block clusters were allocated to the states to meet the ICM sample requirements (Schindler, 1998) with some minor modifications. Most states had between 300 and 500 block clusters, and the very largest states had an allocation of between 1,000 and 2,000 block clusters.

Table 3-3. First-Phase Sampling Strata

First-phase sampling stratum    Definition
Small                           0 to 2 housing units
Medium                          3 to 79 housing units
Large                           80 or more housing units
American Indian Reservation     3 or more housing units and on American Indian Reservations

Within each state, the block cluster sample was proportionally allocated to the medium and large sampling strata based on the number of housing units in the sampling stratum:

$$ c_{state,k} = C_{state} \times \frac{H_{state,k} }{H_{state} } $$

where
k = medium or large sampling stratum;
c_{state,k} = target number of clusters in sampling stratum k within state;
C_{state} = target number of A.C.E. first-phase medium and large sample clusters for state;
H_{state,k} = number of housing units in sampling stratum k within state;
H_{state} = number of housing units in the medium and large strata in state.

As an example, let's say that 402 total medium and large block clusters were allocated to a particular state. Assuming that there are an expected 9,000 housing units in all clusters in the medium sampling stratum and 12,060 housing units in both the medium and large sampling strata, the target number of clusters from the medium sampling stratum for the state is calculated as follows:

$$ c_{state,medium} = 402 \times \frac{9,000}{12,060} = 300. $$

The target number of clusters from the large sampling stratum would then be 102.

Small block clusters. Because of cost considerations, small block clusters were generally sampled at a lower rate than either medium or large clusters. An overall allocation of 5,000 small block clusters was chosen because a total of 30,000 block clusters was deemed manageable for creating independent address lists. The high weights resulting from the lower sampling rates were not expected to have a serious impact on the estimates or variances for most clusters selected from the small block cluster sampling stratum. However, for clusters that were initially classified as small, but were observed to have a larger number of housing units, there was concern about high sampling weights disproportionately contributing to variance. In an attempt to avoid the problems associated with the high weights, a larger number of small clusters was initially selected, followed by an independent address list, followed by a subsample to remain in sample. Using updated measures of size for those 5,000 small block clusters in the small cluster reduction helped to target clusters that could have contributed disproportionately to the variance. These initial 5,000 small clusters were allocated to states proportionately to their estimated total number of housing units in small blocks.

Ideally, we would have allocated the 5,000 block clusters proportionally to states based on the number of small block clusters in the state. This was not possible because the first-phase sampling was done on a flow basis.

American Indian Reservation block clusters.
To ensure sufficient sample for calculating reliable coverage estimates for American Indians living on reservations, we allocated 355 block clusters to American Indian Reservations nationwide. The 355 clusters were allocated to 26 states proportional to the 1990 population of American Indians living on reservations. Small block clusters on American Indian Reservations were not included in these 355 block clusters. These clusters were eligible for selection in the small cluster stratum. Block clusters within states containing little or no American Indian population on reservations were represented in the medium and large strata.

This sample allocation resulted in variable first-phase selection probabilities across the states despite our goal of having proportional allocation of the American Indian Reservation (AIR) sample. This occurred because the average number of housing units per American Indian Reservation block cluster varied across states. To get similar first-phase selection probabilities, we needed to have all of the block clustering completed before allocating the sample. However, the first-phase sampling was done on a flow basis.

Selecting the Primary Sampling Units Within Each State

Calculation of the sampling parameters. The block cluster probability of selection (PS) for each of the four sampling strata in each state is the ratio of the target sample size to the number of clusters in the stratum. It takes the following form:
$$ PS_{state,k} = \frac{c_{state,k} \times L_{state} }{C_{state,k} } $$

where
PS_{state,k} = probability of selection (sampling rate) in sampling stratum k within state;
C_{state,k} = number of clusters in sampling stratum k within state;
c_{state,k} = target number of clusters in sampling stratum k within state;
L_{state} = the factor to reduce the number of clusters to select for the state, if the expected listing workload exceeded the planning estimate:

$$ L_{state} = 1 \ \text{(small, medium, and AIR sampling strata)}, \qquad 0 < L_{state} \le 1 \ \text{(large sampling stratum)}. $$

The large block cluster sampling rate was reduced if the expected number of housing units to list was greater than the planning estimate of the listing workload. A second step of sampling was necessary in Missouri and Indiana because the selected sample of clusters resulted in a greater number of housing units to list than was expected. To meet operational constraints, a subsample of the first-step selected block clusters was selected. The second step of sampling only occurred in the large sampling stratum, since that stratum disproportionately contributed to the listing workload. The second step occurred only if the estimated number of housing units in the medium and large strata was at least ten percent larger than the planning estimate of the number of housing units to be listed. For states needing the second step of sampling, the sampling rate took the following form:

$$ PS2_{state} = \frac{PW_{state} }{W_{state} } $$

where
PS2_{state} = second-step sampling rate for the large sampling stratum in state,
W_{state} = resulting workload estimate from sample selection for the large sampling stratum in state,
PW_{state} = planning workload estimate for the large sampling stratum in state.

Sorting the PSUs. The first-phase clusters were sorted within each sampling stratum as follows:
- American Indian Country Indicator
- Demographic/Tenure Group
- 1990 Urbanization
- County code
- Block cluster identification number

Although there was no differential sampling within the four first-phase sampling strata, the clusters were sorted by several variables in an attempt to improve the representativeness of the sample of block clusters. The first variable was the American Indian Country Indicator, which separated the block clusters into three American Indian
categories:
1. American Indian Reservation or other trust land,
2. tribal jurisdiction statistical area, Alaska native village statistical area or tribal designated statistical area, and
3. all other areas.

The second sort variable was the demographic/tenure group. Block clusters containing similar demographic/tenure proportions, based on 1990 census data, were grouped. To aid in selecting a sample that was well represented by the six major race/origin groups, as well as owners and renters, block clusters were classified into 12 demographic/tenure groups. Although many block clusters tend to have a large proportion of one demographic/tenure group, rarely were they entirely composed of only one, thus many clusters fit well in two or more categories.
To ensure that each cluster was assigned to only one group, a hierarchical assignment rule was developed so that when a cluster exceeded the first group threshold, it was assigned to that group. These thresholds were based on a multivariate clustering method applied to 1990 census blocks. Table 3-4 lists these threshold values. The hierarchy gives the smaller demographic groups priority over the larger ones and renters priority over owners. For example, if the approximate distribution of a block cluster population was 20 percent Asian Renter, 40 percent Asian Owner, and 40 percent White and other Renter, then the block cluster was assigned to the Asian Renter demographic/tenure group.

Table 3-4. Demographic/Tenure Group Thresholds (50 States and the District of Columbia)

Order  Demographic/Tenure Group                    Threshold
1      Hawaiian and Pacific Islander renters       10%
2      Hawaiian and Pacific Islander owners        10%
3      American Indian and Alaska Native renters   10%
4      American Indian and Alaska Native owners    10%
5      Asian renters                               20%
6      Asian owners                                20%
7      Hispanic renters                            20%
8      Hispanic owners                             20%
9      Black renters                               25%
10     Black owners                                25%
11     White and other renters                     30%
12     All others                                  all others

A third sort variable was the estimated level of urbanization based on 1990 data for each block cluster. Each block cluster was categorized either as an urbanized area with 250,000 or more people, an urbanized area with less than 250,000 people, or a non-urban area. And finally, the clusters were sorted geographically using county and cluster number.

General sampling procedure. A systematic sample of block clusters was selected from each sampling stratum with each block cluster having the same probability of selection within a sampling stratum. The method used to select systematic samples follows:
1. Sampling units were sorted using the PSU sort criteria described at each sampling phase.
2. Each successive PSU was assigned an index number 1 through N within each sampling stratum where N is the number of PSUs in the stratum.
3. A random number (RN) between zero and one, 0 < RN ≤ 1, was generated.
4. A random start (RS) for the sampling stratum was calculated. The random start was the random number multiplied by the inverse of the probability of selection, RS = RN × 1/PS, such that 0 < RS ≤ 1/PS.
5. Sampling sequence numbers were calculated. Given N PSUs, sequence numbers were: RS, RS + 1 × (1/PS), RS + 2 × (1/PS), ..., RS + n × (1/PS) where n was the largest integer such that
[RS+(n-1)1/PS]N.Sequencenumberswereroundeduptothenextinteger.Anintegernumberroundedtoitself.6.SamplingsequencenumberswerecomparedtotheindexnumbersassignedtoPSUs.ThePSUwiththe indexnumbercorrespondingtotheroundedsequencenumberwasselected.AllPSUswithoutcorrespondingindexnumberswerenotinsample.First-PhaseSampleResultsTable3-5liststheblockclustersamplesizesandthenum-berofhousingunitsbysamplingstratumforeachstate,theDistrictofColumbia,andthenation.3-8SectionIChapter3DesignoftheA.C.E.SampleU.S.CensusBureau,Census2000 Table3-5.StateFirst-PhaseSampleResultsbyFirst-PhaseStratum StateFirst-phasehousingunits 1First-phaseblockclustersSmallMediumLargeAIRTotalSmallMediumLargeAIRTotal Alabama......................607,90019,000026,9601162861090511 Alaska........................205,20023,2002028,440201901371348 Arizona.......................207,80044,7002,60055,12086269180113648 Arkansas.....................409,60015,900025,540903531010544 California.....................5045,000227,600230272,8801841,4421,311112,948 Colorado......................208,00025,6006033,680832931572535 Connecticut...................106,10025,600031,710202111590390 Delaware.....................207,20028,700035,920202431560419DistrictofColumbia............104,80050,500055,310201322470399Florida........................507,50050,1003057,6801452592301635 Georgia.......................706,10030,300036,4701542201620536 Hawaii........................103,00042,400045,410201031610284 Idaho.........................108,20010,90014019,25054312756447Illinois........................1008,60022,300031,0001852811400606Indiana.......................806,1009,700015,880140202510393 Iowa.........................1206,8009,500016,420147242530442Kansas.......................1106,40011,1003017,640193237631494 Kentucky.....................607,20022,300029,560962681350499 Louisiana.....................1011,30024,900036,210654071550627 Maine........................205,80011,0001016,83038226791344 
Maryland.....................205,30038,000043,320361771750388 Massachusetts................206,40022,000028,420382291400407Michigan......................507,90015,10015023,2001222681045499Minnesota....................706,00014,00027020,3401412088310442 Mississippi....................408,40011,70012020,26081303773464 Missouri......................1105,70014,500020,310162200710433 Montana......................108,4009,70084018,950673336724491Nebraska.....................806,8007,7007014,650142245553445 Nevada.......................106,40057,80019064,400462252305506NewHampshire...............205,70015,400021,120252011060332NewJersey...................108,70030,100038,810392821780499NewMexico...................109,30024,8001,64035,75010833513670649NewYork.....................8017,600124,70070142,45014360363151,382NorthCarolina.................1006,70020,7008027,5801432361214504NorthDakota..................1005,9009,10034015,4401212366412433 Ohio.........................1107,80024,000031,9101322681330533 Oklahoma.....................609,00017,30027026,6301423141018565Oregon.......................105,20015,4007020,68086195903374 Pennsylvania..................11012,90022,600035,6101804271460753RhodeIsland..................107,60018,000025,610202561080384SouthCarolina................408,20019,100027,340952851120492SouthDakota.................505,8009,20045015,5001062425727432Tennessee....................907,80025,400033,2901332851370555Texas........................7034,700148,50030183,3003491,22268112,253 
Utah.........................109,10023,90012033,130383121447501Vermont......................205,60012,000017,62021201880310Virginia.......................605,60031,900037,56098961660460Washington...................205,60021,40048027,5007318712017397WestVirginia..................305,00013,100018,13046189790314Wisconsin.....................806,2008,20022014,7001192115810398Wyoming.....................108,7009,2009018,00072346695492TotalU.S......................2,400438,6001,539,8008,6201,989,4205,00015,3938,38835529,136 1Preliminarycensusaddresslisthousingunitcountsfromspring1999.SECONDPHASEOFTHEA.C.E.SAMPLEDESIGNThesecondphase,oftenreferredtoastheA.C.E.reduc-tionphase,linkedthefirst-phasesampleselectiontotheA.C.E.samplingplan.TheA.C.E.reductionwasthefirstofseveraloperationsthatreducedthenumberofhousing unitsfromthenearlytwomillionhousingunitsintheindependentlistingtotheapproximately300,000housingunitsthatweresentforinterview.Sincenotallofthefirst-phaseblockclusterswererequiredforA.C.E.,thereduc-tionsubsampledthoseclusters,withtheselectedclusters retainedfortheA.C.E.operations.FollowingtheselectionoftheA.C.E.first-phasesample,fieldstaffvisitedtheblockclustersandcreatedaninde-pendentaddresslistforA.C.E.TheseupdatedhousingSectionIChapter33-9DesignoftheA.C.E.SampleU.S.CensusBureau,Census2000 unitcountswereusedintheclustersubsamplingphase.Theclustersubsamplingwasdoneseparatelyfor:*mediumandlargeclusterreduction,and
- smallblockclusterreduction.MediumandLargeClusterReductionThemediumandlargeclusterreductionwasthetransitiontotheA.C.E.samplingplan.Theresultingnationalsampleallocationwasroughlyproportionaltostatepopulationwithsomedifferentialsamplingwithinstates.Onlyblock clustersfromthemediumandlargefirst-phasesamplingstratainthe50statesandtheDistrictofColumbiaweresubsampledinthisphase.Aspartofthesamplereduc-tion,twootherobjectivesoftheA.C.E.samplewereimple-mented.Oneobjectiveofthemediumandlargeclusterreductiondesignwastostratifythefirst-phaseclustersbasedontherelationshipofcurrenthousingunitcountsfromtheA.C.E.independentlistingandtheupdatedcensusaddresslistas ofJanuary,2000.Clustersweresampledwithdifferentselectionprobabilitiesinordertoreducethevariancecon-tributionduetoinconsistenthousingunitcountsbetween theupdatedcensuslistandtheindependentlist.Clusters withsignificantdifferencesbetweenthecountswereexpectedtohavehigherroneousenumerationandhighomissionrates.Theobjectiveofdifferentiallysampling thesetypesofclusterswastoreducethesamplingweightsassociatedwithclustershavingrelativelyhighnumbersofmissedpersonsorthoseenumeratedinerror, and,thus,havingpotentiallyhighvariancecontributions.Asecondobjectiveofthemediumandlargeclusterreduc-tiondesignwastodifferentiallysampleclustersbasedon theestimateddemographiccompositionofthecluster.ClusterswithahighproportionofpersonsofHispanicori-ginorpersonsbelongingtoacensusracegroupother thanWhitewereclassifiedintoaminoritystratum.These typesofclustersweresampledatahigherratethanpre-dominantlynon-HispanicWhiteclusters,inordertoincreasethesamplesizeandimprovethereliabilityofthe A.C.E.populationestimatesforthesehistoricallyunder-countedsubgroups.Stratifyingsecond-phaseclusters.Eachblockclusterwasputintotwocategoriesforthemediumandlargeclusterreduction:ademographicgroupandaconsistencygroup.Blockclusterswereputintoreductionstratabased 
on the combination of these two groups.

Demographic groups were based on the demographic/tenure groups created in the first-phase sample selection. The demographic/tenure groups represented a classification of block clusters, using the information of race/Hispanic origin and tenure of each block reported in the 1990 census. The demographic/tenure groups were used as a sort variable in the selection of the first-phase sample. For this reduction, clusters were put into two demographic groups by combining the 12 demographic/tenure groups in Table 3-4. The two demographic groups are:
1. Minority: block clusters from one of the ten minority demographic/tenure groups
2. Non-minority: block clusters from one of the two other demographic/tenure groups

For this reduction, two updated cluster housing unit counts were used: the independent listing housing unit count and the housing unit count from the updated census address list as of January 2000. The two housing unit counts were compared, and clusters were placed into consistency groups based on the relationship of the housing unit counts. Large differences between the counts indicated that coverage problems might occur; thus, the sampling weights for such clusters were controlled to avoid serious variance effects. Clusters were placed into three consistency groups as shown in Table 3-6.

Table 3-6. Second-Phase Sampling Consistency Groups

Relationship                                                    Consistency group
Independent list is at least 25 percent lower than census       Low inconsistent
Independent list is at least 25 percent greater than census     High inconsistent
Independent list is within +/-25 percent of census              Consistent

For List/Enumerate clusters (see attachment), the census housing unit count was not known at the time of the reduction since this census operation had not started. Thus, all such clusters were classified as high inconsistent.

Based on the demographic group, the consistency group, and the independent listing housing unit count, block clusters were assigned to one of five reduction strata:
1. Minority (low inconsistent, high inconsistent, consistent)
2. Non-minority low inconsistent
3. Non-minority high inconsistent
4. Non-minority consistent
5. Medium stratum jumper

Medium stratum jumper clusters were selected from the medium sampling stratum for the first-phase sample, but had 80 or more independent listing housing units. Medium clusters were sampled at lower rates than large clusters in the first-phase sample since large clusters eventually were to undergo within-cluster housing unit subsampling, an operation that increases sampling weights. Medium stratum jumper clusters also went through within-cluster housing unit subsampling, meaning the already higher sampling weights of these clusters became even larger. Retaining all of the medium stratum jumper clusters in this reduction avoided introducing significant weight variation in the sample.

Allocating sample to strata. The first step was to allocate the national sample of 300,000 housing units to the 50 states and the District of Columbia, in most cases proportional to 1998 population estimates, with a minimum of 1,800 housing units in each state. Hawaii was allocated approximately 3,750 housing units due to its concentration of Hawaiian and Pacific Islanders for which separate population coverage estimates were planned.

Within each state, the second-phase selection probabilities varied somewhat among the strata. First, all clusters in the
medium stratum jumper reduction stratum were retained. For the remaining four reduction strata, higher retention rates were used in the minority, non-minority low inconsistent and the non-minority high inconsistent reduction strata than the non-minority consistent stratum. The stratum differential sampling factor is the ratio of the probability of selection for the stratum to the probability of selection for the consistent stratum.

The following statements describe how the stratum differential sampling factors were set to yield the overall state sample size. These are not exact rules, but give a sense of how much differential sampling within states was done.
- The maximum expected sampling weight after all subsampling, the inverse of the overall probability of selection, was 650 for the non-minority consistent reduction stratum.
- The maximum differential sampling factor was 3 for the two inconsistent reduction strata.
- The differential sampling factor was around 2 for the minority reduction stratum, except in small states where all of the minority clusters were retained.

The differential sampling factors were assigned using guidelines designed to achieve the two objectives of the reduction, while also controlling the size of the sampling weights and the amount of differential sampling. This led to the design of the differential sampling factors summarized in Table 3-7. Using the stratum differential sampling factors and the estimated number of housing units, the sample allocation for each reduction stratum was derived as follows:
$$ T_g = T \times \frac{DSF_g \, H_g}{\sum_{g=1}^{4} DSF_g \, H_g} $$

where
g = A.C.E. second-phase sampling stratum,
T_g = target number of sample housing units allocated to reduction stratum g,
T = state target number of sample housing units modified for medium stratum jumper clusters,
H_g = estimated number of housing units in the reduction stratum based on the independent listing housing unit counts, and
DSF_g = differential sampling factor for reduction stratum g.

Table 3-7. A.C.E. Second-Phase Sample Design Parameters for Large and Medium Clusters

The first four numeric columns are the differential sampling factors¹.

State                 Minority²  Low inconsistent³  High inconsistent⁴  Consistent⁵  Target sample size⁶  First-phase sample size⁷
Alabama               1.78       1.78               1.78                1.00         4,470                26,960
Alaska                6.20       3.00               3.00                1.00         1,800                28,440
Arizona               1.78       1.78               1.78                1.00         4,800                55,120
Arkansas              2.00       3.00               3.00                1.00         2,610                25,540
California            2.00       3.00               3.00                1.00         33,510               272,880
Colorado              1.99       2.93               2.93                1.00         4,080                33,680
Connecticut           2.00       3.00               3.00                1.00         3,360                31,710
Delaware              2.91       3.00               3.00                1.00         1,800                35,920
District of Columbia  2.00       3.00               3.00                1.00         1,800                55,310
Florida               1.00       1.00               1.00                1.00         15,300               57,680
Georgia               2.01       2.01               2.01                1.00         7,830                63,470
Hawaii                2.00       3.00               3.00                1.00         3,750                45,410
Idaho                 2.71       3.00               3.00                1.00         1,800                19,250
Illinois              1.19       1.19               1.19                1.00         12,360               31,000
Indiana               1.68       1.68               1.68                1.00         6,060                15,880
Iowa                  2.00       3.00               3.00                1.00         2,940                16,420
Kansas                2.00       3.00               3.00                1.00         2,700                17,640
Kentucky              2.00       3.00               3.00                1.00         4,050                29,560
Louisiana             1.89       3.00               3.00                1.00         4,470                36,210
Maine                 6.55       3.00               3.00                1.00         1,800                16,830
Maryland              1.87       2.46               2.46                1.00         5,280                43,320
Massachusetts         2.33       2.33               2.33                1.00         6,300                28,420
Michigan              1.25       1.25               1.25                1.00         10,080               23,200
Minnesota             2.11       2.11               2.11                1.00         4,860                20,340
Mississippi           1.96       2.83               2.83                1.00         2,820                20,260
Missouri              2.25       2.25               2.25                1.00         5,580                20,310
Montana               1.57       3.00               3.00                1.00         1,800                18,950
Nebraska              2.44       3.00               3.00                1.00         1,800                14,650
Nevada                1.95       2.76               2.76                1.00         1,800                64,400
New Hampshire         6.84       3.00               3.00                1.00         1,800                21,120
New Jersey            2.24       2.24               2.24                1.00         8,340                38,810
New Mexico            1.73       1.73               1.73                1.00         1,800                35,750
New York              2.00       3.00               3.00                1.00         18,660               142,450
North Carolina        1.83       1.83               1.83                1.00         7,740                27,580
North Dakota          2.14       3.00               3.00                1.00         1,800                15,440
Ohio                  1.22       1.22               1.22                1.00         11,490               31,910
Oklahoma              2.00       3.00               3.00                1.00         3,420                26,630
Oregon                1.94       2.76               2.76                1.00         3,360                20,680
Pennsylvania          1.70       1.70               1.70                1.00         12,300               35,610
Rhode Island          2.94       3.00               3.00                1.00         1,800                25,610
South Carolina        1.60       1.60               1.60                1.00         3,930                27,340
South Dakota          1.83       3.00               3.00                1.00         1,800                15,500
Tennessee             1.99       2.86               2.86                1.00         5,580                33,290
Texas                 1.86       2.36               2.36                1.00         20,280               183,300
Utah                  2.00       3.00               3.00                1.00         2,160                33,130
Vermont               6.91       3.00               3.00                1.00         1,800                17,620
Virginia              1.90       1.90               1.90                1.00         6,960                37,560
Washington            2.23       2.23               2.23                1.00         5,850                27,500
West Virginia         2.00       3.00               3.00                1.00         1,860                18,130
Wisconsin             1.75       1.75               1.75                1.00         5,370                14,700
Wyoming               1.99       3.00               3.00                1.00         1,800                18,000

¹ The observed or actual sampling factors differed from the design sample rates. See the section on Selecting a subsample.
² Clusters with high concentrations of minorities.
³ Clusters where the independent listing housing unit count is at least 25 percent lower than the updated census list count.
⁴ Clusters where the independent listing count is at least 25 percent higher than the updated census list.
⁵ Clusters where the independent listing count and the updated census list do not differ by more than 25 percent.
⁶ Target state housing unit interview sample size, excluding American Indian Reservation sample.
⁷ First-phase preliminary census address list housing unit counts from Spring, 1999.

Sorting the PSUs. The first-phase clusters within each second-phase stratum by first-phase sampling stratum were sorted as follows:
- Consistency group
- List/Enumerate indicator
- American Indian Country Indicator
- Demographic/Tenure Group
- 1990 Urbanization
- County code
- BlockclusteridentificationnumberSelectingasubsample.Sincethefirst-phasesampleutilizeddifferentsamplingratesforthemediumandlargesamplingstrata,separatesamplesweredrawnforeach second-phasestratumwithinthefirst-phasesamplingstrata.Selectingthesamplerequiredcalculatingthesam-plingrates,sortingtheclusters,anddrawingasystematic sampleofclusters.Allofthemediumstratumjumperswereretainedinthesample.Thesamplingratesfortheremainingfourreduc-tionstratawerecomputedsothatanintegernumberofblockclusterswasselected.Thisrequiredcomputingasamplingratebasedontheratioofhousingunitswhich resultedinanon-integerexpectednumberofclusters,determininganintegernumberofclusterstoselect,andcalculatingthefinalsamplingratebasedontheratioof clusters.Themediumandlargeclusterreductionfollowed thesamplingprocedurediscussedearlier.Thisresultedinatotalof9,765outof24,136mediumandlargeclustersretainedintheA.C.E.sampleforthe50 statesandtheDistrictofColumbia.Mediumandlargeclusterreductionsampleresults.Table3-8liststhenumberofhousingunitsandclustersin sample.SmallClusterReductionThefirst-phasesamplecontained5,000smallclustersintheUnitedStates.Smallclusterswereexpectedtohave betweenzeroandtwohousingunitsbasedonanearly censusaddresslist.Conductinginterviewingand follow-upoperationsinclustersofthissizewasnotas costeffectiveasinlargerclusters.Therefore,toallocate A.C.E.resourcesmoreefficiently,onlyasubsampleof thesesmallclusterswasretainedintheA.C.E.sample.Thissubsamplingoperationattemptedabalanceamongthreegoals.Onegoalwastopreventanysmallclusters fromhavingsamplingweightsthatwereextremelyhighcomparedtootherclustersinthesample.Second,sam-plingweightsshouldbeloweronclusterswherethenum-berofhousingunitswasdifferentthanexpected.These firsttwogoalsattemptedtoreducethecontributionofsmallclusterstothevarianceofthedualsystemesti-mates.Thethirdgoalwastoimproveoperationaleffi-ciencybyreducingthenumberofclustersandfuturefieldvisits.Toachievethesegoals,differentialsamplingwas 
used.

Stratifying first-phase clusters. The first-phase small clusters were classified into nine possible reduction strata within each state. These strata were defined using three cluster characteristics: size, American Indian Country status, and List/Enumerate status.

The size of a cluster was based on the greater of the independent listing housing unit count or the updated census address list housing unit count as of January 2000. For List/Enumerate clusters the size was always based on the actual independent listing count since the List/Enumerate operation had not yet started by the time of this reduction. The American Indian Country status had three categories as described in the first phase of sampling. Table 3-9 contains the reduction strata for small block clusters.

Table 3-8. Second-Phase Results: Medium and Large Block Cluster and Housing Unit Counts

Number of ...    Minority  Low inconsistent  High inconsistent  Consistent  Stratum jumpers  American Indian reservations  Nation
Housing units¹   230,529   49,086            94,850             403,806     32,064           9,251                         819,586
Clusters         2,553     971               842                4,801       243              355                           9,765

¹ Independent Listing counts as of December, 1999.

Table 3-9. Small Block Cluster Second-Phase Strata

Second-phase stratum  Housing units  American Indian country  List/Enumerate status
1                     0 to 2         No                       No
2                     3 to 5         No                       No
3                     6 to 9         No                       No
4                     10+            -                        -
5                     0 to 2         No                       Yes
6                     3 to 9         No                       Yes
7                     0 to 9         Reservation/Trust land   -
8                     0 to 2         TJSA/TDSA/ANVSA¹         -
9                     3 to 9         TJSA/TDSA/ANVSA          -

¹ Tribal Jurisdiction Statistical Area/Tribal Designated Statistical Area/Alaska Native Village Statistical Area

Determining target sampling rates. Using independent listing housing unit counts, target sampling rates were determined. These rates attempted to satisfy the previously discussed statistical and operational goals.

Generally, the small clusters were stratified into four groups based on the number of housing units in the cluster. All clusters with ten or more housing units, on American Indian land, or classified as List/Enumerate were retained in sample. For the remaining three reduction strata, some differential sampling was introduced.

To determine the sampling rates for these strata, two conditions were imposed. One of these conditions was that, if possible, the number of weighted housing units in a cluster did not exceed 2,400 housing units. Through computer simulations, a number of different limits were tried until a cap of 2,400 yielded a sample of appropriate size.
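As a rough illustration of the first condition, the sketch below turns the 2,400 cap into a selection probability for each of the three strata that were subsampled (0 to 2, 3 to 5, and 6 to 9 housing units in Table 3-9). It assumes that a cluster's weighted housing units equal its housing unit count divided by its overall selection probability, so the largest cluster size in a stratum fixes the smallest probability that still respects the cap. The function name and the dictionary of stratum maxima are illustrative only and are not taken from the A.C.E. production system.

```python
from fractions import Fraction

WEIGHT_CAP = 2400  # cap on weighted housing units per cluster, from the text

# Largest housing unit count in each subsampled stratum (Table 3-9, strata 1-3).
STRATUM_MAX_HU = {"0 to 2": 2, "3 to 5": 5, "6 to 9": 9}


def target_selection_probability(max_hu, cap=WEIGHT_CAP):
    """Smallest overall selection probability that keeps max_hu / p <= cap.

    Assumption: weighted housing units = housing units / selection probability.
    """
    return Fraction(max_hu, cap)


for label, max_hu in STRATUM_MAX_HU.items():
    p = target_selection_probability(max_hu)
    # 1 / p is the implied maximum sampling weight for the stratum.
    print(f"{label} housing units: 1 in {float(1 / p):.0f}")
```

Under that assumption the three strata come out at about 1 in 1,200, 1 in 480, and 1 in 267, which agrees with the overall target selection probabilities in Table 3-10.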
The second condition was a minimum sampling rate, which varied among the three strata. Table 3-10 contains a summary of the sampling conditions. Table 3-11 illustrates the process for determining the second-phase sampling rate for each stratum.

The overall target selection probability was based on the maximum number of housing units within a stratum and the previously mentioned cap of 2,400 housing units. For example, the maximum number of housing units in stratum group one was two. Hence, the overall target selection probability was 1 in (2,400/2) or 1 in 1,200. The sampling rate for each second-phase stratum was then set at the rate required to attain these overall target probabilities of selection.

Sorting the PSUs. The first-phase clusters were sorted in the following order in each second-phase stratum:
- 1990 urbanization
- county code
- A.C.E. cluster identification number

Table 3-10. Small Cluster Reduction Sampling Conditions

Second-phase stratum  Cluster size (HUs)  Overall target selection probability  Minimum second-phase sampling rate
1                     0 to 2              1/1,200                               1/10
2                     3 to 5              1/480                                 1/4
3                     6 to 9              1/267                                 1/2.22

Table 3-11. Second-Phase Sampling Rate Criterion

If ...                                                                                                    Then, the second-phase sampling rate equals ...
(Overall target selection probability / First-phase sampling rate) ≥ Minimum second-phase sampling rate   Overall target selection probability / First-phase sampling rate
(Overall target selection probability / First-phase sampling rate) < Minimum second-phase sampling rate   Minimum second-phase sampling rate

Selecting a subsample. Separate samples were selected from each second-phase stratum within each state and the District of Columbia. This required calculating the actual sampling rate for the stratum, sorting the clusters, and drawing a systematic sample of clusters.

All clusters that had 10 or more housing units, were classified as List/Enumerate, or were in American Indian Country were retained in sample. The sampling rates for the remaining three strata were computed to achieve an integer number of block clusters drawn from each stratum, similar to procedures used for the medium and large cluster reduction. This required computing a sampling rate, which resulted in a noninteger expected number of clusters, determining an integer number of clusters to select, and calculating the final sampling rate based on the ratio of clusters. The small cluster reduction followed the sampling procedure discussed earlier.

This resulted in a total of 1,538 out of 5,000 small clusters retained in the A.C.E. sample for the 50 states and the District of Columbia.

Small cluster reduction results. Table 3-12 gives the distribution of block clusters and housing units after small block cluster reduction. As mentioned earlier, the larger of the independent listing housing unit count and the housing unit count from the updated census address list as of January 2000 was used to stratify the clusters. In Table 3-12, only the independent listing housing unit count is used in these tallies. Hence, with 55 clusters, as seen in the 6-9 cluster size, the number of housing units does not achieve the minimum of 330.

Second-Phase Sampling Results

Table 3-13 lists the block cluster sample sizes and the number of housing units in each state, the District of Columbia, and the nation after the second phase of A.C.E. sampling.

Table 3-12. Second-Phase Results: Small Block Cluster and Housing Unit Counts

Cluster size (HUs)¹  American Indian country  List/Enumerate status  Number of housing units²  Number of clusters
0-2                  No                       No                     209                       692
3-5                  No                       No                     358                       117
6-9                  No                       No                     325                       55
10+                  -                        -                      4,532                     112
0-2                  No                       Yes                    59                        290
3-9                  No                       Yes                    76                        16
0-9                  Reservation/Trust land   -                      43                        128
0-2                  TJSA/TDSA/ANVSA³         -                      40                        121
3-9                  TJSA/TDSA/ANVSA          -                      30                        7
Total                                                                5,672                     1,538

¹ The size of a cluster was based on the higher of the independent listing housing unit count or the January, 2000 census address list. For List/Enumerate clusters the size was always based on the actual independent listing count since the List/Enumerate operation had not yet been started by the time of this reduction.
² Keyed independent listing housing unit counts as of January, 2000.
3TribalJurisdictionStatisticalArea/TribalDesignatedStatisticalArea/AlaskaNativeVillageStatisticalArea.SectionIChapter33-15DesignoftheA.C.E.SampleU.S.CensusBureau,Census2000 Table3-13.StateSecond-PhaseSampleResultsbyFirst-PhaseStratum StateSecond-phasehousingunits 1Second-phaseblockclustersSmallMediumLargeAIRTotalSmallMediumLargeAIRTotal Alabama......................543,5997,531011,18414104430161 Alaska........................241,4013,099164,54074022170 Arizona.......................1403,08217,1852,82623,233697961113322 Arkansas.....................162,0773,56605,6591371240108 California.....................40119,12477,91320497,64293528469111,101 Colorado......................192,7229,2485212,0412485552166 Connecticut...................371,6996,71808,454559470111 Delaware.....................71,5722,97904,55834023066DistrictofColumbia............01,2515,40306,65422531058Florida........................2657,97654,9862063,247432592301533 Georgia.......................2114,09521,195025,501271381110276 Hawaii........................111,20022,252023,463640750121 Idaho.........................121,6322,7141524,5103253166107Illinois........................517,52720,041027,619252471310403Indiana.......................1254,1417,431011,69724141460211 Iowa.........................1612,3383,70506,2042179220122Kansas.......................332,1933,488315,7452470221117 Kentucky.....................922,3299,621012,0421492520158 Louisiana.....................73,3326,57409,91340109500199 Maine........................381,4472,02013,506245316194 Maryland.....................223,28817,041020,351677820165 Massachusetts................1053,46711,471015,04310120800210Michigan......................646,61213,58114820,40519227925343Minnesota....................793,2107,27528610,850281164910203 Mississippi....................842,4992,957965,6362076253124 Missouri......................2693,22911,558015,05624113510188 
Montana......................151,8802,3659055,16541601424139Nebraska.....................251,6851,317913,1183153133100 Nevada.......................11,3618,50620410,0723828305101NewHampshire...............501,6582,53504,243114619076NewJersey...................44,88314,960019,84781471030258NewMexico...................291,8132,6661,8546,36276471970212NewYork.....................5828,25662,6169371,547342713175627NorthCarolina.................3005,14918,90113624,48628151934276NorthDakota..................351,3322,0763943,83734581712121 Ohio.........................1466,90622,631029,683222301270379 Oklahoma.....................962,5575,1422678,06210489318232Oregon.......................72,1657,2311249,5275270443169 Pennsylvania..................2038,62215,227024,052282931070428RhodeIsland..................61,5172,51704,04044718069SouthCarolina................1133,5409,094012,7471588390142SouthDakota.................221,3072,6134534,39540551427136Tennessee....................3814,00010,436014,81724125580207Texas........................71413,47347,0113061,2281494052381793 Utah.........................1122,5834,0611346,8902948237107Vermont......................161,1913,23704,444104520075Virginia.......................623,44320,872024,377151311120258Washington...................2253,32012,97643816,959331067617232WestVirginia..................241,2634,66605,953104623079Wisconsin.....................1644,3805,90921910,672241383910211Wyoming.....................131,7781,186893,0666162115139TotalU.S......................5,672187,104642,3039,263844,3421,5385,8803,53035511,303 1KeyedindependentlistinghousingunitcountsasofJanuary2000.Keyedimpliesthesecountswentthroughaqualitycontrolreview.Conse-quently,smalldiscrepanciesmayexistbetweentheseindependentlistinghousingunitcountsandthosefromTable3-8.3-16SectionIChapter3DesignoftheA.C.E.SampleU.S.CensusBureau,Census2000 
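The second-phase sampling rate criterion for small clusters (Tables 3-10 and 3-11) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and variable names are hypothetical, the stratum values are transcribed from Table 3-10, and the comparison is read as "greater than or equal to" for the first row of Table 3-11. It is not the Census Bureau's production code.

```python
from fractions import Fraction

# Transcribed from Table 3-10: stratum -> (largest cluster size, minimum second-phase rate).
SMALL_CLUSTER_STRATA = {
    1: (2, Fraction(1, 10)),
    2: (5, Fraction(1, 4)),
    3: (9, Fraction(100, 222)),  # 1/2.22
}

# Cap on housing units mentioned in the text.
HOUSING_UNIT_CAP = 2400


def overall_target_probability(stratum):
    """Overall target selection probability: largest cluster size over the cap."""
    max_size, _ = SMALL_CLUSTER_STRATA[stratum]
    return Fraction(max_size, HOUSING_UNIT_CAP)


def second_phase_rate(stratum, first_phase_rate):
    """Rate needed to reach the overall target, but never below the stratum minimum.

    Hypothetical helper: no upper cap is applied because the text does not mention one.
    """
    _, minimum = SMALL_CLUSTER_STRATA[stratum]
    required = overall_target_probability(stratum) / Fraction(first_phase_rate)
    return max(required, minimum)
```

For example, with a hypothetical first-phase rate of 1 in 100, stratum 1 would need a rate of 1/12 to reach 1 in 1,200, which is below the minimum, so the minimum of 1/10 applies.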
THIRD PHASE OF THE A.C.E. SAMPLE DESIGN

In very large block clusters, the housing units within the cluster were subsampled. This achieved manageable field workloads for A.C.E. interviewing and person follow-up without having a big impact on reliability. The strategy of the A.C.E. large block cluster sampling plan was to increase the number of clusters in sample, while still attaining the targeted number of housing units for interview. Because housing units in a block cluster are often similar, interviewing all of them is not the most efficient use of resources. Instead, interviewing a manageable fraction of several different clusters provides a more geographically diverse sample.

In the first-phase sampling, large block clusters had a higher selection probability than medium block clusters to take into account this anticipated, subsequent housing unit reduction. The A.C.E. second-phase reduction maintained the differential selection probabilities between the large and medium block clusters. After the reduction of housing units in large block clusters, the housing unit selection probabilities in medium and large block clusters in the same second-phase sampling stratum were similar.

Another important goal of this housing unit reduction was to geographically overlap the P and E samples to reduce the E-sample person follow-up workload. An overlapping P and E sample was not necessary, but improved the precision of dual system estimates, the cost-effectiveness of the succeeding operation, and the data processing efficiency.

Identifying the P-Sample Housing Units

The source of the P-sample housing units, which were subject to person interviewing by the field staff, was the independently listed housing units that were confirmed to exist following the housing unit matching and follow-up operations. (See Chapter 4.) In block clusters that had fewer than 80 of these housing units, all of the housing units were designated to be in the P sample. In addition, all housing units in a block cluster selected from the American Indian Reservation stratum were in the P sample, regardless of how many housing units were in the block cluster. Most block clusters from this stratum were expected to have fewer than 80 housing units and it was desirable to avoid introducing weight variation to the sample cases for this stratum. For block clusters with 80 or more housing units, the housing units were subsampled and the selected housing units were in the P sample.

The reduction of housing units within a large block cluster was done by forming groups of adjacent housing units called segments and selecting one or more segments of housing units to participate in the P sample. The segments had approximately equal numbers of housing units within a block cluster. Segments of housing units were used as the sampling units in order to obtain compact interviewing workloads and to facilitate overlapping P and E samples to reduce E-sample person follow-up workloads.

Flow of operations. A complication of this project was that large block clusters were ready for the housing unit subsampling on a flow basis as the preceding operations, housing unit matching and follow-up, were completed. To remain on schedule, it was essential that the P-sample housing units were selected and prepared for interview as quickly as possible. This meant that sampling parameters were computed based on the housing unit counts from the independent listing. If scheduling had not been an issue, the housing unit counts from the housing unit matching and follow-up would have been used. The time schedule constraints did not permit the entire country to be processed prior to subsampling. Further, there was no pre-specified order in which block clusters were ready for housing unit subsampling. Thus, following the flow of block clusters from the preceding operations, the housing unit subsampling was performed daily.

Stratifying third-phase clusters. Before selecting the sample of segments, block clusters were divided into seven strata within each state. The first five strata were the same strata used for the second phase of sampling for the medium and large first-phase strata. The sixth stratum was the small to large stratum jumpers, block clusters from the small stratum observed to have more than 80 housing units during the independent listing. The seventh stratum was equivalent to the first-phase American Indian Reservation stratum, for which no housing unit reduction was done.

Allocating the sample. Nationally, the target distribution of the 300,000 P-sample housing unit sample was roughly proportional to population size, except for increases in sample size in the smaller states, which had roughly equal sizes. The second phase introduced differential sampling within each state and generated overall target sample sizes for each reduction stratum in the state, the T_g in the earlier section. Based on these targets and the observed second-phase sample block clusters, the sample was allocated to each stratum to provide approximately equal overall probabilities of selection for housing units from the same stratum.

Determining sampling parameters. Separate sampling parameters were computed for each stratum within a state. For each stratum, the selection probability was the ratio of the target number of housing units from large block clusters over the number of housing units from the independent listing in large block clusters.

Within-cluster sampling rate = (Target housing unit sample size in large block clusters) / (Number of listed housing units in large block clusters)

The target housing unit sample size was derived by subtracting the number of housing units in medium block clusters based on the independent list from the target stratum sample size. When tallying the housing unit counts from the independent list, any housing units classified as future construction were omitted from the count. Although some of this future construction was probably going to be built by Census Day, it was expected to be a rare occurrence.

Within a particular stratum in a state, a fixed number of segments was formed in each block cluster. This number was a function of the within-cluster sampling rate. This method yielded different size segments across block clusters within the same stratum. This method is a trade-off between having fewer segments to reduce nonsampling error and having more segments of a fixed size to reduce sample size variation. Nonsampling error was reduced by having fewer segment boundaries to identify.

If the within-cluster sampling rate was less than or equal to 0.5, then

Number of segments = 1 / (within-cluster sampling rate)

rounded up to the nearest integer. When the within-cluster sampling rate was greater than 0.5, the above formula results in only two segments, resulting in increased sample size variation with the larger segment size. To better control sample size variation when the sampling rate was greater than 0.5, the number of segments was calculated as

Number of segments = 1 / (1 - within-cluster sampling rate)

Forming the segments. Within each block cluster the housing units were sorted by census block and geographic location within the block. Then based on the number of segments, approximately equal numbers of housing units were assigned to each segment.

Selecting a subsample. Within-cluster subsampling was done daily as the clusters completed the housing unit matching and follow-up operations. Despite the daily processing, the subsampling was equivalent to a one-time sample, since the results of the previous day were carried over to the next and continued. The one difference with the daily operation was the inability to control the block cluster sort across all block clusters in the stratum due to the flow of the block clusters. So, each day the block clusters that were to be subsampled were sorted by block cluster number within each stratum.

A sample of segments was selected by taking one systematic sample across all large block clusters in each stratum within a state. Selecting one systematic sample per sampling stratum, rather than a separate sample from each large cluster, reduced sample size variability. This allowed an observed sample size close to the target housing unit sample size to be achieved.

P-Sample Results

Following within-cluster subsampling, the sample for the 50 states and the District of Columbia was 11,303 block clusters containing about 301,000 housing units.
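The third-phase sampling parameters described in this section (the within-cluster sampling rate, the number of segments, and the formation of roughly equal segments) can be sketched as follows. This is a minimal illustration, not the production procedure: all function names are hypothetical, rounding up in the "greater than 0.5" case is an assumption (the text states the rounding only for rates of 0.5 or less), and placing the extra housing units in the earliest segments is likewise an assumption.

```python
import math
from fractions import Fraction


def within_cluster_rate(target_large_hus, listed_large_hus):
    """Target housing units in large clusters over listed housing units in large clusters."""
    return Fraction(target_large_hus, listed_large_hus)


def number_of_segments(rate):
    """Number of segments formed per block cluster for a rate strictly between 0 and 1."""
    rate = Fraction(rate)
    if not 0 < rate < 1:
        raise ValueError("rate must be strictly between 0 and 1")
    if rate <= Fraction(1, 2):
        return math.ceil(1 / rate)
    # Assumption: round up here as well; the source does not say so explicitly.
    return math.ceil(1 / (1 - rate))


def form_segments(sorted_units, n_segments):
    """Split already-sorted housing units into contiguous, nearly equal segments."""
    base, extra = divmod(len(sorted_units), n_segments)
    segments, start = [], 0
    for i in range(n_segments):
        size = base + (1 if i < extra else 0)  # assumption: extras go to the first segments
        segments.append(sorted_units[start:start + size])
        start += size
    return segments
```

Exact fractions are used instead of floats so that a rate such as 4/5 yields exactly five segments rather than six through rounding error.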
Table3-14displaystheresultsforeachstate.3-18SectionIChapter3DesignoftheA.C.E.SampleU.S.CensusBureau,Census2000 Table3-14.StateThird-PhaseSampleResultsforthePSample StateHousingunitcounts 1byclustersize 2Blockclustercountsbyclustersize 20-7980+AIRTotal0-7980+AIRTotal Alabama...........................2,9471,50304,450115460161 Alaska.............................1,152587161,7394821170Arizona...
..........................5,1932,4742,6617,66715455113322 Arkansas...........................1,79592102,71686220108 California
...........................18,60814,91919233,527675415111,101 Colorado...........................2,6621,491504,153113512166 Connecticut
.........................1,9711,27203,24372390111 Delaware...........................1,07769301,7704224066DistrictofColumbia
..................1,1061,08402,1902929058 Florida.............................8,7366,5182015,2543292031533 Georgia............................4,6903,07207,762183930276 Hawaii.............................1,1562,44703,60347740121Idaho..............................1,6533421461,99586156107 Illinois..............................8,5103,855012,3652921110403 Indiana.............................4,1721,77305,945169420211 Iowa...............................2,16282902,991101210122 Kansas.............................2,114552292,666101151117 Kentucky...........................2,6071,37203,979111470158 Louisiana...........................3,0311,38604,417153460199 Maine..............................1,57136111,9328013194 Maryland...........................2,5742,71305,28791740165 Massachusetts
......................4,5001,89306,393151590210 Michigan...........................7,2242,7561479,980259795343 Minnesota..........................3,7341,4202615,1541514210203 Mississippi
..........................2,332602962,93497243124 Missouri............................3,3892,12005,509141470188 Montana............................2,1466548632,8001001524139 Nebraska...........................1,736225791,96186113100 Nevada............................1,1419731892,11470265101NewHampshire
.....................1,15660901,7655323076NewJersey.........................5,3692,90208,271175830258NewMexico...
.....................2,6009881,7363,5881192370212NewYork...........................9,3019,3908818,6913322905627NorthCarolina...
...................4,4053,438937,843177954276NorthDakota...
....................1,7804043812,184951412121 Ohio...............................7,3693,973011,3422621170379 Oklahoma..........................2,6969702603,666193318232 Oregon.............................1,8661,6061243,472127393169 Pennsylvania
.......................9,4632,801012,264344840428RhodeIsland
.......................1,20057401,7744920069SouthCarolina
......................2,5051,99404,499103390142SouthDakota
.......................1,6575204392,177951427136Tennessee...
.......................4,0711,74805,819156510207Texas..............................13,0317,3312920,3625882041793 Utah...............................1,6408461222,48673277107Vermont............................1,34557101,9165718075Virginia..
...........................3,7653,12206,8871561020258Washington
.........................4,0642,0434166,1071476817232WestVirginia...
.....................1,10876901,8775623079 Wisconsin..........................4,3231,1862095,5091642710211Wyoming...........................1,391527831,918121135139TotalU.S..
..........................184,155108,0288,730300,9137,7743,17435511,303 1ThesourceoftheP-samplehousingunitcountswastheindependentlistthatwasconfirmedtoexistfollowingthehousingunitmatchingandfollow-upoperations.
2 Cluster size was based on number of confirmed A.C.E. housing units after housing unit matching and follow-up.

Identifying the E-Sample Housing Units

The E sample consisted of the census enumerations in the same sample areas as the P sample. The source of the E-sample housing units was the unedited census files. Like the P sample, all housing units in block clusters that had fewer than 80 census housing units or in block clusters selected from the American Indian Reservation stratum were designated to be in the E sample. For block clusters with 80 or more housing units, the housing units were reduced and the selected housing units were in the E sample.

The reduction of housing units within a large block cluster was done by mapping the P-sample segments onto the census housing units. This was possible because when there was a match between an A.C.E. independently listed address and a census address during the housing unit matching, the census identification number was linked to the A.C.E. unit. Then the same segment selected for the P sample was selected for the E sample.

The census inventory of housing units changed between the housing unit matching operation and the identification of the E sample. Therefore, some census housing units did not have a link with an A.C.E. unit. These cases were assigned to a segment using pre-specified rules. Sometimes there were a large number of these cases in the segment selected to be in sample. If there were more than 80 of these, then an additional subsample was drawn from these census housing units without a link to an A.C.E. unit.

The data-defined census person enumerations in the E-sample housing units were in the E sample. To be a census data-defined person, the person record had two 100-percent data items filled. Name was not required for the person record to be considered data-defined, but could be one of the two items required to be data-defined.

Census housing units not available for the E sample. Not all housing units on the unedited census file were eligible to be in the E sample. As the census enumerations were being processed, the Census Bureau suspected that there was a significant number of duplicate addresses in the census files. As such, a new census operation, the Housing Unit Duplication Operation, was introduced in the fall of 2000. The primary goal of this operation was to improve the quality of the census; however, its design allowed the A.C.E. operations to proceed. Essentially, suspected duplicate housing units were set aside and analyzed further. These housing units and the corresponding census person enumerations were not eligible for the E-sample component of the A.C.E. nor available for person matching and were excluded from the dual-system estimation calculation. Some of these set-aside housing units and the corresponding census enumerations were later put back into the final census counts.

Subsampling criteria. If a block cluster contained fewer than 80 available census housing units, then all available census housing units were in the E sample. If the block cluster was from the American Indian Reservation stratum, all available housing units were in the E sample. If the block cluster had 80 or more available census housing units, the housing units were subsampled.

Assigning housing units to segments. Within a block cluster, the census housing units were assigned to a segment based on the link to an A.C.E. housing unit address. If there was a link with an A.C.E. unit, then the census housing unit was assigned to the same segment as the A.C.E. unit. This helped to create overlapping P and E samples. Sometimes a census housing unit did not have a link with an A.C.E. housing unit. When this happened, all the available census housing units were sorted and then each census housing unit without a link was assigned to the same segment as the preceding census housing unit.
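The rule for census housing units without a link can be read as a simple carry-forward over the sorted list. The sketch below assumes the units have already been sorted, represents each unit as a hypothetical pair of census identifier and linked segment (None when there is no link), and leaves an unlinked unit at the very start of the list unassigned because the text does not say how that case was handled.

```python
def assign_segments(sorted_units):
    """Map each census ID to a segment, carrying the previous segment forward.

    sorted_units: list of (census_id, linked_segment_or_None), already sorted.
    """
    assigned = {}
    previous = None
    for census_id, segment in sorted_units:
        if segment is None:
            # No link to an A.C.E. unit: reuse the segment of the preceding unit.
            segment = previous
        assigned[census_id] = segment
        previous = segment
    return assigned
```

Because the previous segment is updated after every unit, a run of consecutive unlinked units all inherit the segment of the last linked unit before them.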
When the block cluster contained city-style addresses, the census housing units were sorted by census block number, street name, house number, and unit designation. When the block cluster contained non-city-style addresses, the census housing units were sorted by census block number and geographic location within the block. For city-style census addresses, geographic location was not available.

Selecting the E-sample housing units. Once all the census housing units within a block cluster were assigned to a segment, then the census housing units in the segment or segments selected for the P sample were in the E sample. Occasionally, the selected segment or segments within the block cluster contained more than 80 census housing units that did not link to an A.C.E. housing unit. When this occurred, an additional step of subsampling was done to reduce the E sample follow-up workload, since the census housing units without this link were more likely to contribute to the follow-up workload than census housing units with this link.

A systematic subsample of census housing units without a link to an A.C.E. housing unit was drawn. Using the same sort used for assigning housing units to a segment, a subsample of 40 housing units was selected if the resulting subsampling rate was greater than 0.25. However, to avoid excessive sampling weight variation, the minimum subsampling rate was set to 0.25, resulting in more than 40 census housing units without a link to an A.C.E. housing unit being in the E sample from the particular block cluster.

Special case block clusters. There were special case block clusters when none of the census housing units in a block cluster linked to an A.C.E. housing unit address at the time of the housing unit matching. One example of a special case was a List/Enumerate cluster, since the List/Enumerate operation had not been conducted by the time that the housing unit matching was done. None of the housing units in a List/Enumerate cluster could be assigned to a segment. Instead of selecting a compact segment of housing units to be in the E sample, a systematic subsample of the housing units was drawn using the same method as discussed above. This prevented overlapping the P and E samples when these block clusters were large. This did not happen often.

E-Sample Results

Following E-sample identification and subsampling, the E sample for the 50 states and the District of Columbia was 11,303 block clusters containing about 311,000 housing units. Table 3-15 displays the results for each state.

Table 3-15. State Third-Phase Sample Results for the E Sample

State Housing unit 1 counts by cluster size 
2Blockclustercountsbyclustersize 20-7980+AIRTotal0-7980+AIRTotal Alabama...........................2,7761,79304,569113480161 Alaska.............................925926161,8674425170Arizona...
..........................2,8192,5212,5217,86115257113322 Arkansas...........................1,8381,11802,95685230108 California
...........................17,90616,22827134,405658432111,101 Colorado...........................2,6231,587494,259114502166 Connecticut
.........................2,0741,24103,31573380111 Delaware...........................1,27065901,9294521066DistrictofColumbia
..................1,1441,21602,3602929058 Florida.............................8,1087,0372615,1713202121533 Georgia............................4,3733,34607,719179970276 Hawaii.............................1,3232,65303,97649720121Idaho..............................1,2438501552,24872196107 Illinois..............................8,1904,302012,4922881150403 Indiana.............................4,0821,87005,952170410211 Iowa...............................2,23790703,144102200122 Kansas.............................2,097734272,858100161117 Kentucky...........................2,3601,69204,052107510158 Louisiana...........................3,0781,80904,887152470199 Maine..............................1,59542912,0258013194 Maryland...........................2,6512,78605,43792730165 Massachusetts
......................4,2492,73606,985146640210 Michigan...........................6,6823,31114610,139253855343 Minnesota..........................3,1831,7202605,1631484510203 Mississippi
..........................2,3746471143,13599223124 Missouri............................3,2312,09805,329141470188 Montana............................1,4805218662,867981724139 Nebraska...........................1,637258681,96386113100 Nevada............................9071,1751332,21567295101NewHampshire
.....................1,42646701,8935719076NewJersey.........................4,9523,66608,618170880258NewMexico...
.....................1,1367541,5363,4261212170212NewYork...........................9,11411,0718420,2693262965627NorthCarolina
......................4,5103,2531017,864182904276NorthDakota...
....................1,4014823582,241951412121 Ohio...............................7,2234,016011,2392631160379 Oklahoma..........................2,3661,0382653,669193318232 Oregon.............................1,6442,3781254,147122443169 Pennsylvania
.......................9,1433,449012,592336920428RhodeIsland
.......................1,19455601,7505019069SouthCarolina
......................2,5021,96804,470105370142SouthDakota
.......................1,2784954332,206951427136Tennessee...
.......................4,0222,42906,451157500207Texas..............................12,4129,2132721,6525742181793 Utah...............................1,4348181232,37575257107Vermont............................1,26364001,9035619075Virginia..
...........................3,5553,73107,2861521060258Washington
.........................3,3712,6094116,3911447117232WestVirginia...
.....................1,17372401,8975722079 Wisconsin..........................4,0671,1592115,4371673410211Wyoming...........................1,337554841,975121135139TotalU.S..
..........................178,978123,6408,411311,0297,6903,25835511,303 1Availablehousingunitcountsfromtheuneditedcensusfile.
2 Cluster size was based on available census housing unit tallies.

Third-Phase Sampling Results

Table 3-16 gives the state weighted and unweighted P-sample and E-sample housing units. Also displayed are the average P-sample and E-sample weights, prior to weight trimming, TES adjustment, and nonresponse adjustments. The average weights ranged from approximately 100 to 500.

In Table 3-16, for most of the states, the average E-sample weight is smaller than the average P-sample weight.
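The average weights in Table 3-16 appear to equal each weighted housing unit estimate divided by the corresponding sample size. The short check below uses the national totals from that table; the function name is hypothetical, and treating the average weight as this simple ratio is an assumption that happens to reproduce the published figures.

```python
def average_weight(weighted_estimate, sample_size):
    """Average sampling weight as weighted housing units per sampled housing unit."""
    return weighted_estimate / sample_size


# National totals as printed in Table 3-16.
P_WEIGHTED, E_WEIGHTED = 115_650_803, 115_016_729
P_SAMPLE, E_SAMPLE = 300_913, 311_029

p_avg = average_weight(P_WEIGHTED, P_SAMPLE)   # about 384
e_avg = average_weight(E_WEIGHTED, E_SAMPLE)   # about 370
weighted_ratio = P_WEIGHTED / E_WEIGHTED       # about 1.006
sample_ratio = P_SAMPLE / E_SAMPLE             # about 0.967
```

The weighted ratio of about 1.006 is the basis for the statement that the weighted P-sample count is less than one percent larger than the weighted E-sample count.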
Nationally, despite the P- and E-sample sizes differing by about 10,000 housing units, after applying the weight, the weighted number of P-sample housing units is less than one percent larger than the weighted number of E-sample housing units.

Table 3-16. P-Sample and E-Sample Housing Unit Sampling Results

                        Weighted housing unit estimates       Housing unit sample sizes      Average weight
State                   P sample      E sample      P/E       P sample   E sample   P/E      P sample   E sample
Alabama                 1,967,703     1,953,559     1.007     4,450      4,569      0.974    442        428
Alaska                  186,971       187,657       0.996     1,739      1,867      0.931    108        101
Arizona                 2,291,735     2,419,098     0.947     7,667      7,861      0.975    299        308
Arkansas                1,204,014     1,214,878     0.991     2,716      2,956      0.919    443        411
California              12,255,066    12,129,849    1.010     33,527     34,405     0.974    366        353
Colorado                1,633,980     1,579,070     1.035     4,153      4,259      0.975    393        371
Connecticut             1,262,197     1,249,792     1.010     3,243      3,315      0.978    389        377
Delaware                282,962       285,557       0.991     1,770      1,929      0.918    160        148
District of Columbia    295,972       295,099       1.003     2,190      2,360      0.928    135        125
Florida                 7,350,667     6,958,799     1.056     15,254     15,171     1.005    482        459
Georgia                 3,178,003     3,101,337     1.025     7,762      7,719      1.006    409        402
Hawaii                  446,780       467,582       0.956     3,603      3,976      0.906    124        118
Idaho                   475,978       494,377       0.963     1,995      2,248      0.887    239        220
Illinois                4,752,616     4,723,175     1.006     12,365     12,492     0.990    384        378
Indiana                 2,565,559     2,611,248     0.983     5,945      5,952      0.999    432        439
Iowa                    1,286,159     1,303,393     0.987     2,991      3,144      0.951    430        415
Kansas                  1,054,277     1,085,066     0.972     2,666      2,858      0.933    395        380
Kentucky                1,738,637     1,688,359     1.030     3,979      4,052      0.982    437        417
Louisiana               1,690,093     1,767,498     0.956     4,417      4,887      0.904    383        362
Maine                   606,684       580,671       1.045     1,932      2,025      0.954    314        287
Maryland                2,240,463     2,237,811     1.001     5,287      5,437      0.972    424        412
Massachusetts           2,637,732     2,652,699     0.994     6,393      6,985      0.915    413        380
Michigan                3,945,568     3,948,348     0.999     9,980      10,139     0.984    395        389
Minnesota               1,976,410     1,940,302     1.019     5,154      5,163      0.998    383        376
Mississippi             1,067,393     1,065,495     1.002     2,934      3,135      0.936    364        340
Missouri                2,678,909     2,576,545     1.040     5,509      5,329      1.034    486        483
Montana                 463,607       459,884       1.008     2,800      2,867      0.977    166        160
Nebraska                684,874       667,586       1.026     1,961      1,963      0.999    349        340
Nevada                  895,050       862,509       1.038     2,114      2,215      0.954    423        389
New Hampshire           558,641       523,562       1.067     1,765      1,893      0.932    317        277
New Jersey              3,377,908     3,338,768     1.012     8,271      8,618      0.960    408        387
New Mexico              708,714       667,620       1.062     3,588      3,426      1.047    198        195
New York                7,573,292     7,706,526     0.983     18,691     20,269     0.922    405        380
North Carolina          3,857,166     3,748,539     1.029     7,843      7,864      0.997    492        477
North Dakota            294,040       288,677       1.019     2,184      2,241      0.975    135        129
Ohio                    4,785,461     4,687,680     1.021     11,342     11,239     1.009    422        417
Oklahoma                1,461,163     1,465,046     0.997     3,666      3,669      0.999    399        399
Oregon                  1,411,681     1,431,030     0.986     3,472      4,147      0.837    407        345
Pennsylvania            5,130,010     5,179,175     0.991     12,264     12,592     0.974    418        411
Rhode Island            408,426       401,022       1.018     1,774      1,750      1.014    230        229
South Carolina          2,274,389     2,332,485     0.975     4,499      4,470      1.006    506        522
South Dakota            300,952       297,492       1.012     2,177      2,206      0.987    138        135
Tennessee               2,489,607     2,609,919     0.954     5,819      6,451      0.902    428        405
Texas                   8,116,215     8,098,923     1.002     20,362     21,652     0.940    399        374
Utah                    885,164       823,255       1.075     2,486      2,375      1.047    356        347
Vermont                 307,822       296,414       1.038     1,916      1,903      1.007    161        156
Virginia                2,714,879     2,797,836     0.970     6,887      7,286      0.945    394        384
Washington              2,496,269     2,435,145     1.025     6,107      6,391      0.956    409        381
West Virginia           917,901       916,552       1.001     1,877      1,897      0.989    489        483
Wisconsin               2,274,773     2,268,976     1.003     5,509      5,437      1.013    413        417
Wyoming                 190,271       194,844       0.977     1,918      1,975      0.971    99         99
United States           115,650,803   115,016,729   1.006     300,913    311,029    0.967    384        370

Attachment. Census 2000 Type of Enumeration Areas (TEAs) 1

The term TEA has been used for several decennial censuses. For Census 2000, it reflects not only the type of enumeration, but also the method of compiling the census address list that controls the enumeration process. The Census Bureau defines TEA codes at the census collection block level. Each block must have a TEA code, and no block may have more than one TEA code.

TEA 1 - Block Canvassing and Mailout/Mailback
* Contains areas with predominantly city-style (house number/street name) addresses used for mail delivery.
* Census address list is created from USPS, 1990 census, local/tribal, and other potential supplementary address sources.
* Blocks are included in both Block Canvassing and the Postal Validation Check.
* Blocks are included in local/tribal program to identify new construction.

Mailout/mailback is the most efficient, cost-effective enumeration method in heavily populated areas in which mail is delivered to city-style addresses in virtually all cases (there may be scattered non-city-style mailing addresses in use in these areas). In most instances, a census enumerator visits a residence once during Block Canvassing. A subsequent visit is sometimes necessary during Nonresponse Follow-up.

The mailing list used for this operation is derived initially from automated address files (the USPS Delivery Sequence File and the 1990 Census Address Control File), and
updatedthroughvariousoperations,includingAddressListReview(LUCA1998),ongoingDSFupdates,BlockCan-vassing,thePostalValidationCheck,andtheNewCon-structionProgram.TEA2-AddressListingandUpdate/Leave*Containsareaswithsomenumberofnon-city-style(e.g.,P.O.BoxorRuralRoute)mailingaddresses.*CensusaddresslistiscreatedfromAddressListing,andupdatedfromAddressListReview(LUCA)1999Recan-vassing(inselectedareas)andUpdate/Leave*BlocksareNOTincludedinBlockCanvassing,thePostalValidationCheck,ortheNewConstructionProgram*PuertoRico,includingitsmilitarybases,iscompletelyinTEA2AddressListingandUpdate/Leaveareimplementedinareaswheremailoftenisdeliveredtonon-city-styleaddresses.Intheseareas,itisdifficulttoobtainanup-to-datemailingaddresslistandthengeocodeeachaddress (thatis,assignittoacollectionblockcode),becauseof theconstantlychangingresidentiallocation/mailing addressrelationship(especiallyforP.O.Boxaddresses).
Thecensusaddresslistthereforeiscompiledthrougha door-to-doorindependentlistingoperation(AddressList-ing)thatisimplementedinallTEA2blocks.DuringAddressListing,enumeratorsknockoneachresi-dencedoortoobtaintheoccupantsname,phonenumber,residentialaddress(orlocationdescription),andmailingaddress.(EnumeratorsdoNOTrevisitresidenceswhose occupantsarenotpresent.ThisiswhythecensusaddresslistfrequentlydoesNOTcontainamailingaddress,andwhythelocationdescriptionistheONLYaddressinthe censusaddresslistformanyresidences.)Enumerators identifythelocationofeachbuilding(containinglivingquarters)theyencounterwithauniquelynumberedmapspotthattheyenterontheirmapandrecordintheir addressregister;thisnumberislinkedtoallresidentialunitsinthebuilding,andstoredinboththecensusaddresslistandtheTIGERdatabase.Theseareaswillbe includedinAddressListReview(LUCA)1999.Atcensustime,enumeratorsdelivercensusquestion-nairestoallhousingunitscompiledduringAddressList-ingandthatremaininTEA2.Inthecourseofdelivering thesequestionnaires,theenumeratorsalsoupdatethecensusaddresslistandmapspottedmaptoreflecthous-ingunitsthatwerenotpreviouslylisted,andtoeliminate residencesthattheycannotlocate.(ThisoperationiscalledUpdate/Leave,becausetheenumeratorsUPDATEthecensusaddresslistandmapsandLEAVEquestion-naires.)Update/Leaveenumeratorsusetheresidential address/locationdescriptioninconjunctionwiththemapspotlocationtodeterminethecorrectdeliverypointforallquestionnaires.MosthousingunitsinTEA2areasarevisitedatleasttwicebyenumeratorsonceduringAddressListing,andagainduringUpdate/Listing.Respondentsmustmailtheircom-pletedcensusquestionnairestotheCensusBureau,andso someresidencesalsowillbevisitedathirdtime,duringNonresponseFollow-up.
1 This documentation is reproduced from the Geography Division, U.S. Census Bureau, Web site located at http://www.geo.census.gov/mob/homep/teas.html.

TEA 3 - List/Enumerate

* Contains areas that are remote, sparsely populated, or not easily accessible
* Census address list is created and enumeration conducted concurrently
* Blocks are not included in Block Canvassing, the Postal Validation Check, the New Construction Program, or Address Listing
* Includes all military bases in TEA 3 areas
* All island areas (except Puerto Rico), including their military bases, are TEA 3

Some areas are remote, sparsely populated, and/or not easily visited. Many of the residences in these areas do not have city-style mail delivery. It is inefficient and expensive to implement Address Listing, Update/Leave, and Nonresponse Follow-up operations involving multiple visits. Instead, the creation of the address list and the delivery/completion of the census questionnaire are accomplished during a single operation, List/Enumerate. Enumerators visit residences in TEA 3 blocks, LIST them for inclusion in the census address list, mark their location on their map with a map spot and number, enter that map spot number in their address register, and ENUMERATE the residents on-site. They collect the same address information as in Address Listing, and include a map spot to reflect each building that contains one or more living quarters. These areas will NOT be included in any Address List Review (LUCA) program, because there is no address list for them in advance of the census.

TEA 4 - Remote Alaska

* Similar to List/Enumerate, but conducted earlier, before ice breakup/snow melt
* These areas will NOT be included in any Address List Review (LUCA) program, because there is no address list for them in advance of the census

TEA 5 - Rural Update/Enumerate

* Contains blocks initially in TEA 2, with map spots for all structures containing at least one housing unit
* In some instances, blocks initially in TEA 3 will be converted to TEA 5. These blocks were not included in Address Listing and LUCA 1999, and therefore lack structures and map spots in the MAF and TIGER at the times that LUCA 1999 and Rural Update/Enumerate are conducted
* Self-enumeration (through Update/Leave) is thought to be unlikely or problematic
* Census address list is updated, and enumeration is conducted, concurrently
* Blocks are NOT included in the Postal Validation Check or the New Construction Program
* The term rural reflects Address Listing as the initial source of the census address list, and does NOT reflect the official census definition of the term rural
* These areas will be included in Address List Review (LUCA) 1999 materials, as the MAF was compiled initially from Address Listing

In some areas that otherwise meet the criteria for inclusion in TEA 2, the Census Bureau has decided that having respondents enumerate themselves and return their questionnaires via the mail is not the best way to conduct the enumeration. Some targeted populations may be less likely to return their questionnaires in the mail, and more likely to respond to an enumerator. In other areas, housing units may be vacant because they are occupied seasonally. In these and comparable situations, enumerators visit all residences on the census address list and complete the enumeration on-site. In the course of delivering these questionnaires, they also update the census address list to 1) reflect housing units that were not previously listed (including a map spot to reflect each building that contains one or more living quarters), and 2) eliminate housing units that they cannot locate. (This operation is called Rural Update/Enumerate, because the enumerators work in areas that were Address Listed, UPDATE the census address list [and assign map spots as well], and ENUMERATE the residents.)

TEA 6 - Military

* Contains blocks within TEA 2 that are on military bases
* Mailout/Mailback for family housing
* Separate enumeration procedures for barracks, hospitals, etc.
* Blocks are included in both Block Canvassing and the Postal Validation Check
* These blocks are included in Address List Review (LUCA) 1998 materials, as the MAF was compiled initially in the same manner as TEA 1 areas

The Department of Defense has advised the Census Bureau that virtually all family housing (that is, individual
residences as opposed to barracks, hospitals, and jails) are assigned city-style addresses to which the Postal Service delivers mail. The Census Bureau therefore implements Mailout/Mailback methods to enumerate the population of these individual residences. Within TEA 1 areas, blocks on military bases are assigned a TEA code of 1. Within TEA 2 areas, blocks on military bases are assigned a TEA code of 6. There is no difference between TEA 1 blocks on military bases and TEA 6 blocks in terms of either compiling the census address list or enumerating the population. Blocks within military bases in List/Enumerate areas (TEA 3) also are TEA 3.

TEA 7 - Urban Update/Leave

* Contains blocks initially in TEA 1
* Census address list is updated, and questionnaires are delivered concurrently, by Census Bureau staff (following procedures employed in TEA 2 areas, but without assigning map spots)
* Blocks ARE included in the Postal Validation Check and the New Construction Program
* The term urban reflects the predominance of city-style addresses, and does NOT reflect the official census definition of the term urban
* These blocks are included in Address List Review (LUCA) 1998 materials, as the MAF was compiled initially in the same manner as TEA 1 areas

In many areas where mail is delivered mostly to city-style addresses, older apartment buildings are common. In many of these buildings, unit designators (that is, apartment numbers) often do not exist. Further, the subdivision of existing units into multiple units, and the conversion of non-residential space to living quarters, may be frequent. Mail, therefore, often is not delivered to individual apartments (or individual mailboxes), but instead left at common drop points.

In some other areas with mostly city-style addresses, many residents have elected to receive their mail at post office boxes. The Census Bureau is concerned that the city-style addresses of these residents may not appear in the census address list.

To ensure questionnaire delivery to the largest number of residences, Update/Leave procedures are employed. As these residences have city-style addresses, there is no need for enumerators to assign map spots to assist enumerators in identifying these residences in subsequent operations.

TEA 8 - Urban Update/Enumerate

* Contains blocks initially in TEA 1, without map spots for any addresses; maps generated for TEA 8 areas will not include map spots
* Contains mostly blocks on those American Indian reservations that initially were included in both TEA 1 and either TEA 2 or 3
* Same enumeration procedures as TEA 5
* The term urban reflects the initial inclusion of the block in TEA 1 due to the predominance of city-style mailing addresses
* These areas are included in Block Canvassing and the Postal Validation Check

Most American Indian Reservations will be enumerated using a single enumeration procedure (Mailout/Mailback,
Update/Leave, or Update/Enumerate). Some of these initially contained blocks with a mixture of TEA codes. In these instances, the reservations will be enumerated using Update/Enumerate methods (see TEA 5). However, for affected blocks initially in TEA 1, the MAF and TIGER do not include map spots for structures containing at least one housing unit. Instead of converting these blocks to TEA 5 (Rural Update/Enumerate) and determining map spot locations, the blocks are being distinguished by a separate TEA.

TEA 9 - Additions to Address Listing Universe of Blocks

* Contains groups of blocks (assignment areas) initially assigned to TEA 1
* Converted to Address Listing before Block Canvassing is conducted
* Blocks are NOT included in Block Canvassing, the Postal Validation Check, or the New Construction Program

Some blocks that are in TEA 1 contain a significant number of living quarters with non-city-style addresses. These blocks should not be included in Block Canvassing, which is an operation that is designed to confirm and correct the existence and/or location of city-style addresses. The Geography and Field Divisions are identifying Block Canvassing assignment areas (AAs) that likely contain blocks with significant numbers of non-city-style addresses.
Some of these AAs will be removed from Block Canvassing, and included in Address Listing. The blocks in these AAs will be assigned a TEA code of 9, and the census address list compilation and census enumeration activities in TEA 9 blocks will be virtually identical to those in TEA 2 blocks (for instance, they will be included in Update/Leave and Nonresponse Follow-up).

Because most of these blocks had few, if any, addresses in the MAF from the USPS, the entities the blocks are in mostly had nothing to review during Address List Review (LUCA) 1998. For this reason, most of these blocks will have their address list reviewed during a new phase of LUCA, often called LUCA 99 1/2.

Chapter 4. A.C.E. Field and Processing Activities

INTRODUCTION

This chapter describes the operational aspects of the A.C.E., which consisted of four major activities: housing unit listing, housing unit matching, person interviewing, and person matching. Housing unit listing and person interviews were conducted as field activities, whereas housing unit matching and person matching were processing activities carried out in the National Processing Center (NPC) in Jeffersonville, Indiana. As described earlier, all of these activities were completed prior to estimation.

Once the sample clusters were selected, interviewers visited the clusters and independently listed all housing units. The A.C.E. and census housing units were then matched and, for those for which a match was not found, a follow-up interview was conducted to determine the status of the housing unit at the time of the census.

Following the resolution of the housing unit nonmatches, interviews were conducted with residents of the A.C.E.
sample household (P sample) to obtain the roster of household residents and the detail required for matching. The P-sample persons were then matched to the list of persons enumerated in the census in the sample clusters. The search area was expanded to include one ring of surrounding blocks for those clusters identified as containing potential census geocoding errors. This operation was called the targeted extended search (TES) because it targeted clusters with high rates of A.C.E. housing unit nonmatches and census housing unit geocoding error. A further follow-up interview was conducted for selected mismatched people for whom additional information was required. Based on these activities, each person in the sample clusters, whether interviewed in the A.C.E. sample (P sample) or found in the census (E sample), was assigned a final match status code.

It is important to point out some key improvements of the A.C.E. 2000 operations over the 1990 Post-Enumeration Survey (PES). The 2000 A.C.E. improved on the 1990 PES in several ways for interviewing and clerical matching.

* One problem in 1990 was the misreporting of Census Day addresses, with an estimated 0.7 percent of the P sample being erroneously reported as nonmovers (West 1991). The Computer Assisted Personal Interview (CAPI) instrument improved the quality of the reporting of mover status because it was a more automated process. In 2000, the Census Day household consisted of nonmovers and outmovers. The nonmovers lived in the housing unit at the time of the interview and on Census Day. The outmovers lived in the housing unit on Census Day, but moved before the A.C.E. interview. Nonmovers and outmovers in the P sample were matched to census people in their block cluster. In 1990, each inmover household (those that moved into PES block clusters after Census Day) had to be matched to a Census Day address, which was usually outside the cluster. In 2000, the reconstructed Census Day household was matched to the census enumerations in the sample block cluster.
* A study of clerical error in the 1990 PES found error in coding matches (Davis 1991) and erroneous enumerations (Davis 1991b). In 1990, codes were entered into a computer system, but the actual matching and duplicate searches were done using paper. In the 2000 A.C.E., the matching was better controlled and more efficient than in 1990 because the clerical matching and quality assurance were automated and coded directly into the automated system. The automated interactive system did not prevent all matching error, but reduced the chances for error significantly. Software allowed searching for matches in the census based on first names, last names, characteristics, and addresses. For example, the system allowed searching for all people named George, all people whose last name begins with an H, all people on Elm Street, or everyone in the age 30 to 40 range. The software controlled the match codes that were relevant to the situation. For example, only P-sample nonmatch codes could be assigned to a P-sample nonmatch.
* The electronic searches for duplicates reduced the tedious searching through paper lists of census people. The searching in 1990 was limited to printouts in two sorts: last name and household by address. In 2000, the clerks had the capability to filter on name, characteristics, and address to help identify duplicates. The system monitored whether the matcher had completed all the necessary searches, such as looking for duplicates.
* There were built-in edits to ensure consistency of coding. For example, codes that applied to a household, such as geographic codes, were assigned to all people in the household. The system automatically assigned certain codes, reducing coding error.
* Clerical matchers could use a code indicating the case needed review at the next level of matching. This code allowed them to flag unusual cases to be examined by a person with more experience.
* All quality assurance for the clerical matching was automated.
* Clerical matching was centralized at the NPC instead of having separate groups of matchers in the seven processing offices, as was done in 1990. Forty-six technicians were hired in August 1999 and were thoroughly trained in the design of the A.C.E. and methods of matching people and housing units. These technicians were responsible for quality assurance of the clerical matchers. Additionally, ten analysts who were among the most experienced matchers conducted quality assurance for the technicians and handled the most difficult cases.
* The computer matcher identified matches and possible matches within a block cluster. Additional computer programs were used to check the matching on cases after the before-follow-up clerical matching, to identify matches and duplicates in the expanded search area that were not identified by the clerical matchers. Consistency checks were also performed between housing unit and person match codes.
* Keying error, which had affected the data capture of the 1990 PES, was reduced because the 2000 interview used a CAPI instrument. A more accurate capture of the data increased the efficiency of the computer matching.

HOUSING UNIT LISTING

The first stage of sampling was the selection of A.C.E. block clusters. Then, in September through December of 1999, a listing of the addresses of all the housing units in the A.C.E. sample clusters was conducted. The listing was independent of the census. Training in how to list both city-style and non-city-style areas lasted 3 days and included a review of the first completed cluster assigned to each lister. There were 29,136 sampled block clusters in the 50 states and the District of Columbia. This list of housing units recorded in the Independent Listing Books (ILB) became the frame of A.C.E. housing units from which the P sample was later selected. Besides listing each housing unit in the cluster, the listers inquired about housing units present at each special place and commercial structure.

The housing unit listing was by basic street address. Each basic street address was assigned a map spot number and the map spot number was recorded on the A.C.E. map to identify the location of the basic street address. The address and coverage questions about the structure were
asked for each basic street address. The number of housing units at the basic street address was obtained from a household member at the address, by proxy, from the apartment manager, or by observation. This contact helped to improve the coverage of housing units in the A.C.E. A page in the listing book for single and multiunit structures is shown in Figure 4-1. The individual housing units within a basic street address were listed on the pages of the listing book reserved for multiunits. Also, the A.C.E. lister recorded the number of units within a basic street address on the map in parentheses to conform with census methodology.

Mobile homes that were not in mobile home parks were listed like single units. Each mobile home was assigned a unique map spot number and each mobile home was listed on a separate line in the listing book. If the mobile homes were in a park, the park was listed in the housing unit section of the listing book, and each individual mobile home and vacant site was listed in the mobile home park section of the listing book. Each individual mobile home was assigned a unique map spot number, whether the mobile home was in a park or not. The location of the mobile home was identified by placing the map spot number for the mobile home on the map. This was the same procedure that was used in the census.

The following items were collected and recorded in the listing book for each basic street address:

* City-style addresses (house number and street names)
* Non-city-style addresses (route numbers, route and box numbers, or any other type of address that was not a city-style address)
* Householder names (rural areas only)
* Description of addresses (for only non-house-number addresses in both urban and rural areas)
* Number of housing units in a basic street address
* Type of basic street address (single unit, multiunit, mobile home not in a mobile home park, mobile home in a mobile home park, housing unit in special place, multiunit in a special place, or other)
* Unit status for single units (occupied or intended for occupancy, under construction, future construction, unfit for habitation, boarded up, storage of household goods, and other)

The following items were also collected and recorded in the listing book for each unit within a multiunit basic street address:

* Unit designation
* Unit status for multiunits (occupied or intended for occupancy, under construction, future construction, unfit for habitation, boarded up, storage of household goods, and other)

The following items were also collected and recorded in the listing book for each mobile home in a mobile home park:

* House number, lot number, or physical description
* Street name
* Rural address
* Unit status (intended for occupancy, unfit for habitation, boarded up, storage of household goods, vacant trailer site in a mobile home park, and other)

After the listing books were received in NPC, they were checked in and the data keyed into a computer file. The keying quality assurance was 100 percent. Keying rejects were reviewed clerically to correct errors before the matching began. A data file of the A.C.E. housing units was created to be used as input to the housing unit
matching.

HOUSING UNIT MATCH

The housing unit matching consisted of four steps: computer matching, clerical matching, housing unit follow-up, and after follow-up coding. The A.C.E. housing units were compared to the census housing units within each cluster, first by computer and then clerically. Housing units that did not match, possible matches, and possible duplicates were followed up by field inspection and interview. The results of the follow-up interview were recorded during the after follow-up coding.

The purpose of housing unit matching was to create a list of addresses that existed as housing units in the block cluster on Census Day to use in the P-sample interviewing. The housing unit listing was conducted in the Fall of 1999. Addresses that had a chance to be housing units on Census Day, such as under construction, future construction, and vacant trailer sites, were listed. After the housing unit matching and follow-up, only the housing units originally listed and confirmed to exist as housing units were included in the P sample for CAPI interviewing. Housing units with unresolved status were also included in the interviewing.

Computer matching was conducted after the second phase of sampling, which consisted of sample reduction and small block subsampling. The results of the computer matching were reviewed clerically. All matching was conducted within the sample block clusters. The census addresses were the ones contained in the January 2000 version of the Decennial Master Address File (DMAF). This was not the final version of the inventory of census addresses, because of later operations. The inventory of census housing units was final after the Hundred Percent Census Unedited File (HCUF) was completed.

As noted earlier, the P and E samples were located in the same block clusters. The advantages of linking the A.C.E.
and census housing units were:

* The link of A.C.E. and census addresses allowed an overlapping P sample and E sample (i.e., the housing units selected for the P sample were mostly the same as those in the E sample), eliminating the error-prone clerical E-sample identification required to achieve the overlapping samples in the 1990 Post-Enumeration Survey.
* The linking of addresses also allowed the person interviewing to begin earlier on the telephone using the census telephone number for the census questionnaire returned by mail. The telephone number from the census questionnaire was not available without the link between the A.C.E. housing unit and the census questionnaire for that housing unit.

After sample reduction of clusters, there were 11,303 clusters in the 50 states and the District of Columbia. See Chapter 3 for a discussion on the sample reduction. The 420 clusters in list/enumerate areas were not matched, because their census addresses were not available in the Spring of 2000. Therefore, 10,883 clusters were matched in the housing unit phase. Table 4-1 contains the number of housing units and clusters in housing unit matching for the A.C.E. The census numbers were preliminary; these were the addresses prior to mailing the census questionnaires. Subsequent census operations added and removed addresses from this list. Although this census list contains more housing units than the A.C.E., the difference is not indicative of coverage differences, because the census numbers were preliminary. See Chapter 3 for a discussion of the final P-sample and E-sample housing units and how they compare.

Table 4-1. Sample Sizes for the A.C.E. Housing Unit Matching

                                             Clusters    Housing units
 Clusters with housing units                   10,157
   A.C.E. housing units                                        838,427
   Census housing units                                        859,296
 Clusters without housing units                   726
 Total clusters in housing unit matching       10,883

Computer Match

The census housing units included on the DMAF in January 2000, in the block clusters retained in the A.C.E.
after sample reduction and small block subsampling, were used in the housing unit matching. The housing unit data from the independent listing book file and the DMAF extract went through a series of data preparation steps, including address standardization. Addresses from either file that were blank or could not be standardized were matched clerically. The results of the computer matching and images of the A.C.E. and census maps with map spots in rural areas were inputs into an automated review and coding software for clerical matching.

Clerical Match

The clerical matchers used the results of the computer matching to aid in their matching of addresses from the A.C.E. and the census. There were 115 clerks, 46 technicians, and ten analysts involved in the matching operation. The clerks carried out the matching. The technicians applied quality control to the matching performed by the clerks. The analysts carried out quality control on the work of the technicians. The clerks and technicians used a review code when they saw something unusual or something that should have been looked at by the next level of matcher. The technicians and analysts examined the cases coded for review in the previous stage of matching, in addition to cases selected for quality control. The clerks used in the housing unit matching were given 4 weeks of training. The technicians were hired in August 1999 and given extensive training on the background of coverage measurement and the design of the A.C.E., allowing them to make more informed decisions. The analysts were our most experienced people. The analysts have worked on coverage measurement for many years and were quite knowledgeable about the A.C.E. The three levels of staff produced a high quality of matching with a cost-efficient
operation.

The clerical matching was conducted in the housing unit matching phase of the A.C.E. only for clusters expected to benefit from further examination. Since clerical matching was labor intensive, the amount of clerical work performed for the 2000 A.C.E. was reduced by an automated identification of clusters for follow-up interviewing without clerical review. These clusters had only a few nonmatches or nonmatches on only one side. For example, there could be 25 A.C.E. nonmatches and no census nonmatches, so there was nothing the clerical matchers could do. The clerical matchers were thus able to concentrate on the more difficult clusters where the review was beneficial. In 2000, 3,267 clusters were sent to the field for the follow-up phase without clerical review.

Supplemental materials were provided to facilitate the clerical matching, such as the maps with spots to identify the location of A.C.E. and census addresses in rural areas. The A.C.E. and census addresses that could not be matched by the computer were identified for the clerical matching. The matched addresses were not targeted for review, because experience in studies preparatory to the 2000 Census indicated a very high quality of the matches assigned by the computer. However, clerks were allowed to correct any errors in the computer matching that they noticed, while they were attempting to match the housing units that were not computer matched.

The clerical matchers used all housing unit information available to match housing units. The urban areas were almost totally city-style addresses. In rural areas, the addresses were more difficult to match, mainly because of the non-city-style addresses. The matchers had householder names and location descriptions to help in matching the A.C.E. and census addresses in rural areas. The spotted maps for the A.C.E. and the census were also used in the final determination of which housing units matched in rural areas. Computer images of the A.C.E. and census spotted maps that were used in the housing unit matching were accessed via the matching software and viewed on the screen.

There was also a clerical search, limited to the block cluster, for duplicate housing units during this phase of the
matching. The possible duplicates were linked in the database for both the A.C.E. and the census. A follow-up interview was conducted to determine if the two addresses referred to the same housing unit.

One goal for the 2000 A.C.E. was not to use any paper in the clerical matching. Almost all materials needed for clerical matching were available on the computer. Paperless matching reduced the time needed for clerical matching, because the time spent waiting for an assignment and associated material was eliminated. There was thus no need for a large staff to maintain an A.C.E. library. Paper maps were available to use for cases where the image of the map was not available or was not easy to view in the software.

The quality assurance was applied as follows: all of the work done by each clerical matcher was reviewed initially until the matcher was determined to be performing at an acceptable level of quality. The number of records to be reviewed before a clerical matcher was classified as acceptable was 200, after which an acceptable clerk had a systematic sample of clusters reviewed for quality assurance. There was a computer record of the level of quality of each clerk's work. If the work in the sample of reviewed clusters fell below the acceptable level of quality, all of the subsequent work of that clerk was reviewed by technicians until the clerk achieved an acceptable level of quality; then sampling was resumed. The analysts performed the same type of quality assurance on the technicians.

Table 4-2 contains the results of before follow-up clerical matching. These numbers include only the housing units in clusters that were processed in the housing unit matching. The list/enumerate clusters are therefore not included. The relisted clusters described at the end of this section are also not included in Table 4-2. The census had more possible duplicates and housing units not matching than the A.C.E. The follow-up interview resolved the housing unit status and determined if the possible duplicates were in fact duplicated.

Table 4-2. Housing Unit Matching Results Before Follow-Up Interviewing

                                  A.C.E.                     Census
                          Housing units  Percent     Housing units  Percent
 Matched                        681,385     81.6           681,385     79.7
 Possible match                  29,231      3.5            29,231      3.4
 Possible duplicate                 735      0.1             5,775      0.7
 Not matched                    123,469     14.8           138,657     16.2
 Remove from A.C.E.                  10      0.0
 Total                          834,830    100.0           855,048    100.0

Housing Unit Follow-Up

All of the cases coded as not matched, possibly matched, or possibly duplicated were sent for a follow-up interview, regardless of the type of basic street address code. Selected matched cases were also sent to follow-up to collect additional information. Specifically, the cases identified for field follow-up were:
- A.C.E. addresses with a before follow-up code of not matched. Information was obtained to determine whether the addresses were housing units within the sample cluster.
- Census addresses with a before follow-up code of not matched. Information was obtained to determine whether the addresses were housing units within the sample cluster.
- Possible matches. The possible matches were sent to the field to determine if the A.C.E. and census addresses referred to the same housing unit. If they did not, they were identified as an A.C.E. nonmatch and a census nonmatch during the housing unit follow-up and information was obtained to determine whether the addresses were housing units within the sample cluster.
- Possible census duplicates. Census housing units that were identified as possible duplicates were followed up to determine if the two census addresses referred to the same housing unit.
- Possible A.C.E. duplicates. A.C.E. housing units that were identified as possible duplicates were followed up to determine if the two A.C.E. addresses referred to the same housing unit.
- Matched housing units with a code of under construction, future construction, unfit for habitation, vacant trailer site in a mobile home park, or other. These matches were followed up to determine if they fit the definition of a housing unit at the time of the follow-up interview.

An A.C.E. housing unit with unit status indicating something other than an occupied or vacant housing unit that was intended for occupancy needed a follow-up interview to determine its status at the time of the follow-up interview. The address was either classified as a housing unit or removed from further processing. For example, a unit that was under construction or future construction at the time of listing may have fit the definition of a housing unit at the time of the follow-up interview. If the unit fit the definition of a housing unit, it was included in the A.C.E.
housing unit processing. If construction had not progressed enough for it to fit the definition of a housing unit, it was coded as removed from the A.C.E. housing unit inventory.

The housing unit follow-up forms were computer generated. The questions for housing units requiring a follow-up interview were printed. In addition, all housing units in the block cluster were printed for reference. The questions for the A.C.E. nonmatches are in Figure 4-2. The same questions were asked for the census nonmatches.

The questions on the follow-up form were not designed to be read to respondents, but were intended to be used as a guide for an interviewer. Indeed, many questions were answered by observation. The answer to one question may have been the result of asking several other questions. The follow-up interviewer appropriately modified the questions, when necessary, to the situation that was encountered in the field and recorded the appropriate answers on the follow-up form. This approach was adopted because there were many situations that could occur and a form to cover every possible situation would be cumbersome to handle. It was necessary to find out if the housing unit satisfied the census housing unit definition at the time of the follow-up interview. There was no attempt to gather information about reasons for being something other than a housing unit.

For example, the follow-up interviewer determined if the address for an A.C.E. independent listing nonmatch or a census nonmatch existed as a housing unit. This was not a question meant for a respondent. There were several reasons why an address might not fit the definition of a housing unit, such as it burned, it was a mobile home that moved, it was converted to fewer housing units, it was group quarters, it was used for storage of farm machinery, it was the laundry room in an apartment complex, it was a business, and so forth. The interviewer appropriately modified the questions, as necessary, to the situation that was encountered in the field. Furthermore, the interviewer could identify matches or duplicates in the field that had not been identified in the clerical matching.

Table 4-3. After Follow-Up Housing Unit Matching Results

                                                         A.C.E.                     Census
                                                 Housing units  Percent     Housing units  Percent
 Matched                                               719,013     86.1           719,013     84.1
 Not matched, but existed in the block cluster          76,418      9.2            28,874      3.4
 Did not exist as a housing unit                        30,770      3.7            48,684      5.7
 Geocoded outside the cluster                            6,316      0.8            45,053      5.3
 Duplicate                                               1,157      0.1            12,296      1.4
 Unresolved                                              1,156      0.1             1,128      0.1
 Total                                                 834,830    100.0           855,048    100.0

Additional matches were also identified between the A.C.E. and census addresses during the follow-up interview, when the interviewer realized the two different addresses in the A.C.E. and census referred to the same unit. Corrections and updates to the addresses were also recorded on the follow-up form. The address updates were keyed into the database to accurately identify A.C.E.
housing units for the person interviewing. The follow-up interviewers were instructed not to add housing units missed by both the A.C.E. and census for the 2000 A.C.E.

After Follow-Up Coding

After the field follow-up, the completed forms were returned to the processing office. Using the information obtained during the field work, an after follow-up match code was assigned by the clerical matchers for cases sent to the field. The technicians and analysts reviewed the clusters containing housing units with a review code and carried out quality assurance for the clusters processed in the after follow-up housing unit matching. The follow-up forms were reviewed clerically and codes were assigned to the A.C.E. and census housing units.

Table 4-3 provides housing unit matching results for all A.C.E. and census housing units after the follow-up interview codes were assigned. A.C.E. housing units classified as existing in the block cluster and housing units with unresolved housing unit status were eligible for person interviewing. This included both matched and not matched units. A.C.E. addresses classified as not housing units, duplicates, and geocoding errors were removed from the A.C.E. universe, and therefore were not eligible for person interviewing. The numbers in Table 4-4 are the A.C.E. housing units that were eligible for person interviewing before sample reduction. These numbers do not include the relisted clusters and clusters in list/enumerate areas. Census housing units with codes of not matched and unresolved statuses were not eligible to be included in the P sample for interviewing because they were not listed in the A.C.E. independent listing.

Table 4-4. A.C.E. Housing Units Eligible for Person Interviewing

                                                         A.C.E.
                                                 Housing units  Percent
 Matched                                               719,013     90.3
 Not matched, but existed in the block cluster          76,418      9.6
 Unresolved                                              1,156      0.1
 Total                                                 796,587    100.0

Relisting for Clusters with A.C.E. Geocoding Errors

The follow-up operation also examined potential geocoding errors in the original A.C.E. housing unit listings. If a large proportion of the A.C.E. housing units in the cluster had wrong geocodes, the cluster was relisted. Clusters were identified for relisting when the after follow-up coding described in the previous section was completed. The decision to relist was automated. If 80 percent of the housing units in a cluster had geocoding error, the cluster was relisted. There were 62 relisted clusters in the 50 states and the District of Columbia. The field lister for a relisted cluster had no previous contact with that cluster.

The relisting operation was carried out independently of the list of census housing units. To assure independence, the A.C.E. housing unit listings (both the original listing and the relisting) were done without the A.C.E. lister seeing the census inventory of housing units.

There was no housing unit matching in the relisted clusters during the housing unit matching phase of A.C.E. The addresses listed for A.C.E. during the relisting operation were the addresses used to conduct person interviewing. These clusters were treated in the same way as the list/enumerate clusters in 2000.

An unresolved code was assigned to all of the A.C.E. housing units in the relisted clusters and in the list/enumerate clusters. The census housing units in these clusters were assigned a blank housing unit code.

PERSON INTERVIEW

Prior to person interviewing there was another stage of sampling, the within block subsampling of large block clusters. See Chapter 3 for more details. The resulting housing units from A.C.E. comprised the P-sample housing units assigned for interviewing. There were 11,303 clusters selected for interviewing, and they contained 300,913 P-sample housing units. The person interview training lasted five days.

A.C.E. mover and residence status codes necessary to identify P-sample people from the person interview are
assigned within the interview instrument. These codes are described in Figure 4-3. The goal of the interview was to obtain a household roster for everyone living at the housing unit at the time of the interview and on Census Day, April 1, 2000.

Procedure C was used for the 2000 A.C.E. With Procedure C, each A.C.E. person was assigned an A.C.E. mover code, an A.C.E. born since Census Day code, an A.C.E. group quarters code, and an A.C.E. other residence code. The A.C.E.
status code combined all of the information from these codes to identify the people for whom matching was necessary. Attachment 1 contains the definitions for codes for the movers, those born since Census Day, members of group quarters, other residence code, and the A.C.E. status code. See the Chapter 7 attachment for more on Procedure C.

Group quarters were not listed in the A.C.E. and A.C.E. interviews were not conducted in group quarters. See Attachment 2 for a discussion of the treatment of group quarters in A.C.E.

Mode of Interview

The A.C.E. person interview was conducted using a CAPI instrument on laptop computers. Attachment 3 contains a description of the procedures followed in the person interview. Some person interviews were conducted by telephone and some by personal visit.

To get an early start for the interviewing, a telephone interview was conducted at households where the census questionnaire included a telephone number and was received at a census processing office early enough for computer processing, before the start of person interviewing. The telephone number came from the census questionnaire of the matching census housing unit. The person interviews conducted by telephone were conducted from April 24, 2000 until June 13, 2000. See Byrne et al. (2001) for more details. A total of 88,573 interviews or 29.4 percent of the total workload were conducted by telephone.

The following cases were excluded from the A.C.E. telephone interviewing:
* Housing units in census large household and census coverage edit follow-up
* Questionnaires that were not returned by mail
* Housing units without house number and street name addresses
* Housing units in small multiunit structures (i.e., less than 20 units)

Large multiunits were able to be included in the telephone interviewing, because they tended to have unique unit designations. Many small multiunit structures and rural areas did not have addresses that allow the telephone interviewer to distinctly identify the address. Since there was no housing unit matching in relisted and list/enumerate clusters, all person interviewing in relisted and list/enumerate clusters was by personal visit.

All remaining interviews after the end of the telephone operation were conducted in person, except for some nonresponse conversion operation (NRCO) interviews and
interviews in gated communities or secured buildings. The person interviews conducted by personal visit were conducted from June 18 until September 11, 2000. Crew leaders and supervisors conducted telephone interviews to give them experience in interviewing.

Table 4-5 contains the number of interviewers, crew leaders, and supervisors used during production interviewing and during the interviewing for person follow-up after the clerical matching.

Table 4-5. Field Interview Personnel

                    Telephone interview   Personal interview   Person follow-up
Interviewers ...... 450                   4,502                4,470
Crew leaders ...... 794                   836                  712
Supervisors ....... 189                   186                  184

For the first 3 weeks of interviewing, the person interview was conducted only with a household member. If an interview with a household member could not be carried out within 3 weeks, an interview with a knowledgeable nonhousehold member was attempted, called a proxy interview. The proxy interviewing was allowed during the remainder of the interviewing period. During the last 2 weeks of interviewing for a cluster, a nonresponse conversion operation was conducted for the noninterviews using the best interviewers. This noninterview conversion attempted to obtain an interview with a household member or a knowledgeable proxy respondent, but not a last resort interview (see footnote 1). The nonresponse conversion operation converted 9,518 of the 9,735 total noninterviews to interviews.

Footnote 1: Last resort interviews were ones with minimal information, such as names like White Female. The last resort interview is usually not from a knowledgeable proxy respondent. Last resort interviews were conducted in the census at the end of nonresponse follow-up, after all attempts to contact a knowledgeable respondent have not obtained an interview. Last resort interviews were not conducted for A.C.E.

The Questionnaire

There were three paths or sections within the person interview. An interview was conducted using the first two paths, when at least one of the household members, for whom information was required, currently lived at the housing unit when the interview was conducted. One path collected data from a household member, and another path collected data from a nonhousehold member (i.e.,
proxy respondent) for these people. There were two paths, because the questions were worded differently for interviews with household members and with proxy respondents. The interviews from the first two paths were in housing units containing:
* Whole household nonmovers
* Whole household inmovers
* Households with a mixture of nonmovers, inmovers, and outmovers

The third path was for whole household outmovers. The data for outmovers was obtained by proxy with the current resident in the sample household or with other proxy respondents, when necessary. When there was an interview with whole household inmovers, there was also an interview using the third path for whole household outmovers.

When there were multiple interviews for the same housing unit, the CAPI data from the last interview was selected for processing. If there was also a quality assurance interview that replaced the original interview, the quality assurance interview was selected over any other interview.

After the interviewers obtained the names and characteristics of household members, they established the residence status on Census Day. For nonmovers and outmovers, mover status in addition to questions about group quarters and other residences on Census Day established the residence status.

College students living elsewhere in dormitories were not part of the A.C.E. universe. However, they were inadvertently included as inmovers in the A.C.E. instrument. To correct for this, an edit was performed for partial household inmovers who were in group quarters on Census Day.
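The college-student edit can be read as a simple record-level rule. The following is only a minimal sketch under stated assumptions: the `Inmover` record, its field names, and the `"unedited"` and `"removed"` status labels are hypothetical and are not taken from the A.C.E. instrument. The only facts drawn from this section are that the edit applied to partial household inmovers who were in group quarters on Census Day, and that those aged 18 through 22, inclusive, received a status of removed.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Inmover:
    # All field names are hypothetical, chosen for illustration only.
    age: int
    in_group_quarters_on_census_day: bool
    whole_household_inmover: bool
    ace_status: str = "unedited"


def apply_college_student_edit(person: Inmover) -> Inmover:
    """Return a copy with status 'removed' when the edit applies."""
    eligible = (
        not person.whole_household_inmover            # partial household inmover
        and person.in_group_quarters_on_census_day    # in group quarters on Census Day
        and 18 <= person.age <= 22                    # ages 18 to 22, inclusive
    )
    return replace(person, ace_status="removed") if eligible else person
```

Under these assumptions, a 19-year-old partial household inmover who was in group quarters on Census Day would be marked removed, while a 23-year-old with the same circumstances would be left unchanged.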
If the inmover was in group quarters on Census Day and was between the ages of 18 and 22, inclusive, the inmover was given an A.C.E. status code of removed.

Quality Assurance of Person Interviewing

The quality assurance plan for the A.C.E. Person Interview operation consisted of a reinterview of a sample of the original A.C.E. interviews. The workload consisted of a preselected random sample of 5 percent of the total person interview caseload and another sample consisting of cases targeted by the supervisors in the regional offices using specially designed targeting reports. The targeting was based on various indicators likely to predict poor data quality or potential fabrication. The targeted sample was another 5 percent of the total workload.

A separate CAPI questionnaire was designed for the quality assurance interviews. The quality assurance questionnaire contained separate paths for telephone and personal visit quality assurance interviews. The questionnaire also included a complete version of the original interview to allow quality assurance interviewers to conduct the household interview on cases suspected of fabrication.
Consequently, it was not necessary to assign another field representative at a later date to conduct the household interviews for such cases.

Quality assurance interviews were conducted either by telephone or personal visit. The interview determined whether or not the original respondent was contacted by an interviewer. If, after an initial set of questions, it appeared that the respondent had not been previously contacted, the quality assurance interview continued with a full household interview that replaced the original interview in all future processing.

The quality assurance plan centered on whether the original interviewer actually contacted the person who was reported to have been interviewed. When this was the case, the interview itself was assumed to be correct because the person interview questionnaire was designed to ensure data quality using data edits and automated questionnaire skip patterns. When this was not the case (i.e., the proper household was not contacted), a full reinterview was conducted.

The quality assurance plan was designed to be most effective for the few interviewers who blatantly include data from fictitious interviews. This occurs in practice in similar surveys. Therefore, discrepant results were targeted by looking for inconsistent or conspicuous results identified using the targeting reports. Examples of inconsistent or conspicuous results include using the same name for respondents across cases, using famous names for household members, or completing cases too late in the day to really have been interviewing at someone's house.

Effectively identifying an interviewer with only one or two errors in a large workload of cases would require a prohibitively large random sample. Because later A.C.E.
operations such as the person follow-up interview were expected to identify such cases, the quality assurance plan did not attempt to identify these situations beyond what falls in the 5 percent random sample.

Preliminary Estimation Outcome Codes

Preliminary P-sample estimation outcome codes were assigned to each P-sample housing unit before the computer and clerical matching. This outcome code was assigned to the housing unit based on Census Day for nonmovers and outmovers. Only people with the following A.C.E. status codes were used in the matching operations:
* N = nonmover resident
* O = outmover resident
* U = unresolved residence status

The preliminary estimation outcome codes identified interviews and noninterviews in occupied housing units, vacant housing units, and housing units that were removed from the P sample. The interview outcomes described in this section were Census Day interview outcomes after data editing, which converts whole households of Census Day residents with insufficient information for matching to noninterviews and whole households of Census Day residents who should not have been counted at the housing unit on Census Day to vacant housing units.
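The two rules in this passage, the status-code filter for matching and the data edit that reclassifies whole households, can be illustrated with a short sketch. It rests on assumptions: the dictionary keys `status`, `counted_here`, and `sufficient_info`, and the outcome labels returned, are hypothetical names introduced for illustration. Only the status codes N, O, and U and the two whole-household conversions come from the text.

```python
# Status codes used in the matching operations, as listed in the text.
MATCHING_STATUS_CODES = {"N", "O", "U"}


def people_for_matching(people):
    """Keep only people whose (hypothetical) 'status' field is N, O, or U."""
    return [p for p in people if p["status"] in MATCHING_STATUS_CODES]


def edited_outcome(roster):
    """Classify one interviewed housing unit after the data edit.

    roster: Census Day roster records, each with two hypothetical flags:
      'counted_here'    - person should have been counted at this unit
      'sufficient_info' - enough information for matching and follow-up
    """
    residents = [p for p in roster if p["counted_here"]]
    if not residents:
        # Whole household should have been counted elsewhere: treat as vacant.
        return "vacant"
    if not any(p["sufficient_info"] for p in residents):
        # Whole household lacks sufficient information: treat as noninterview.
        return "noninterview"
    return "interview"
```

The sketch treats a household as an interview as soon as one Census Day resident has sufficient information, which matches the "whole household" wording of the edit but is otherwise an assumption.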
Interviews
* Complete interviews. Interviews conducted with a household member.
* Proxy interviews. Interviews conducted with someone outside the household.
* Sufficient partial interviews. Interviews with household members or proxies that did not collect all required data, but did collect enough information to be considered as interviews.

Noninterviews
* Field noninterview.
* Whole households of people with insufficient information to permit matching and follow-up.

Vacant on Census Day
* Housing units identified as vacant on Census Day by the interviewer.
* Whole households of people who should have been counted elsewhere on Census Day (i.e., whole household nonresidents).

Not a Housing Unit on Census Day
* The housing units identified during the person interview as not a housing unit on Census Day were removed from the P sample.

Table 4-6 contains the number of each category of preliminary outcome codes and the number and percentages of total occupied and vacant housing units for the preliminary outcome codes grouped into interview, noninterview, and vacant. The percentages of interview and noninterview for occupied housing units were also included. The noninterview rate for occupied housing units was 1.9 percent based on the preliminary outcome codes before clerical matching. The interviewers identified 10,206 addresses or 3.4 percent of the A.C.E. addresses as not being housing units on Census Day. The A.C.E. housing units identified as something other than housing units were not in the P sample. For more details see Childers et al. (2001).

Table 4-6. Preliminary Census Day Estimation Outcome for A.C.E. Housing Units (Unweighted)

Outcome code                                                 Total housing units    Occupied housing units
                                                             Number     Percent     Number     Percent
Interview .................................................  257,624    88.6        257,624    98.1
  Complete interview with a household member ..............  235,632
  Complete interview with a proxy respondent ..............  19,380
  Sufficient partial interview ............................  2,612
Noninterview ..............................................  4,988      1.7         4,988      1.9
  Field noninterview ......................................  2,667
  All people have insufficient information for
    matching and follow-up ................................  2,321
Total occupied housing units ..............................                         262,612    100.0
Vacant ....................................................  28,095     9.7
  No Census Day residents .................................  4,184
  Vacant on Census Day ....................................  23,911
Total occupied and vacant housing units ...................  290,707    100.0
Not a housing unit on Census Day ..........................  10,206
Total interviewed housing units ...........................  300,913

The percent noninterview was calculated for the unweighted numbers of noninterviews divided by the occupied interviews, which was the interviews plus the noninterviews. Tables of preliminary noninterview rates are presented for respondent type and interview mode in Tables 4-7 and 4-8.

Table 4-7. P-Sample Preliminary Percent Noninterview in Before Follow-Up by Respondent Type

Respondent type            P-sample preliminary percent noninterview
Household member ......... 0.9
Proxy .................... 13.8
Total .................... 1.9

Of all interviews at occupied housing units, 33.5 percent were completed by telephone, 66.1 percent were completed by personal visit, and 0.3 percent, which was 910 interviews, were completed by a quality assurance replacement interview. The percent noninterview of occupied housing units for each interview mode is shown in Table 4-8.

Table 4-8. P-Sample Preliminary Percent Noninterview Before Follow-Up by Interview Mode

Interview mode                       P-sample preliminary percent noninterview
Telephone .......................... 0.9
Personal visit ..................... 2.2
Quality assurance replacement ...... 36.0
Total .............................. 1.9

While telephone interviews were more likely than personal visit interviews to have insufficient information because neighbors could not be contacted, this was offset by the straightforward nature of the telephone interviews. These were cases where the respondent completed and returned the census form in a timely manner and provided a telephone number on the form. Conversely, personal visit cases tended to be the more difficult situations (such as movers or reluctant respondents), and were therefore much more likely to result in noninterviews.

There were several reasons for a high noninterview rate for the quality assurance replacement interviews. These were difficult interviews, because they failed the quality assurance check and needed a reinterview. Many of the noninterviews were refusals. Additionally, because the instrument was monitoring both the quality assurance case and the replacement interview, it was difficult to obtain the Census Day residents in mover cases so that many of these were noninterviews. There was also a problem with the instrument in cases where the quality assurance interviewer could not find the address on the day of the QA interview. When this occurred, the case failed the quality assurance check, but no data were collected to replace the original interview since the QA interviewer could not find the address. However, unlike in personal visit cases, no attempt was made by the QA interviewer to determine if the sample address also did not exist on Census Day. Therefore, these cases were considered to be Census Day noninterviews. There were 108 such cases.

PERSON MATCHING

After both the CAPI interviewing and the HCUF were completed, the E sample was identified from the HCUF and person matching began. People with incomplete names were identified by computer for both the P and E sample, because they did not contain sufficient information for matching and follow-up. See Attachment 4 for more information about census data-defined and insufficient information for matching and follow-up.

The P-sample people and those in the HCUF, within the sample clusters, were computer matched. The possible matches, P-sample nonmatches, and E-sample nonmatches
were clerically reviewed using an automated matching and review system. Additional matches and possible matches were identified by the clerical staff. Duplicates on both lists were also identified clerically. After the matching was completed, field follow-up was conducted and the results of the field interview were coded in the matching database.

Within Block Cluster Computer and Clerical Matching

With procedure C, the people in P-sample housing units, who were initially matched to the E-sample and non-E-sample census enumerations were:
* nonmovers and outmovers identified as residents (i.e., A.C.E. status equal to N and O), or
* people with unresolved residence status (i.e., A.C.E. status equal to U)

The matching within the sample clusters was done by the computer matcher followed by a computer assisted clerical review. The computer compared the nonmovers and outmovers to the E-sample census enumerations in sample clusters and when necessary to the non-E-sample enumerations. These non-E-sample enumerations were census people in housing units that were not included in the E sample after the subsampling of census housing units. The clerical matchers also searched among people enumerated in the census in group quarters. A match was assigned when the name and characteristics in the P sample for a person were found in the census data within the block cluster.

During computer matching, the P sample was matched to the census. However, this matching was prioritized; first the P sample was matched to the E sample, then any leftover nonmatches from the P sample were matched to the non-E-sample people in housing units. The matching occurred in two steps:
* Record Pair Ranking. The standardized names from the P-sample person and the census person were compared along with the person characteristics using a string comparison (Winkler, 1994). A ranking score was assigned to each pair of people and the optimal pairs were identified.
* Determination of Match Cutoffs. The optimal pairs in the cluster were reviewed to determine the cutoffs for matches and nonmatches. All pairs above the match cutoff were identified as a match. All pairs between the match cutoff and nonmatch cutoff were identified as possible matches. All pairs below the nonmatch cutoff were classified as not matched. Match cutoffs were assigned conservatively to prevent false matches.

The goal of the matching and follow-up operation was to produce the correct ratio of cases classified as omitted from the census to those classified as correctly included in the census. After the computer matching, P-sample and E-sample people who did not match were reviewed clerically. The clerical matchers were able to match people the computer could not, because they had the whole household to aid in matching. The P-sample nonmatches were searched for in the census. A duplicate search was also conducted clerically. The matching and duplicate search was aided by the software in sorting and searching the census records. The computer assisted clerical matching software contained all A.C.E. and census information about P-sample and census people, including names, characteristics, outcome of the interview, and address.

The A.C.E. technicians carried out the quality assurance for the clerical matchers and resolved the cases flagged by the clerical matchers as needing further review. The A.C.E. analysts did the quality assurance for the technicians and resolved the cases flagged by the technicians as needing further review. There were 235 clerks, 46 technicians, and 10 analysts to do the clerical matching.

Census Images. Scanned images of census questionnaires were available for matching for the first time in Census 2000. The clerical matchers used these images as an aid in matching and when additional information (like names) was found, the new information was made available for the follow-up interview. An E-sample record could be updated by the clerks to provide sufficient information for matching and follow-up or to correct image capture errors. In addition, some information written outside the capture boxes was used to update the data.

For Census 2000, all census forms were scanned and the subsequent information was interpreted using Optical
Mark Recognition and Optical Character Recognition or was keyed. For person matching, images were only available for housing units on the January, 2000 DMAF. Images were not available for census housing units added after January, 2000. An address identified by census identification number (ID) could return more than one form, including the following: original census form, Be Counted form, a foreign language form, and/or a Simplified Enumerator Questionnaire. Be Counted forms were not available to use for viewing images, since they did not have a census ID associated with the form when data captured.

The clerical matchers reviewed data for census people with insufficient information for matching and follow-up and searched for additional information that might allow them to be matched when the image was available. All review of census people with insufficient information for matching and follow-up was done before the clerical matching began and the census data in the matching software was updated. The software did not permit the assignment of a code until there were two characteristics and a complete name. After the software data were updated, the clerical matching process began and the matchers could match the P-sample person to the census people now containing sufficient information for matching. The matchers were also able to review data for nonmatches when they suspected data capture errors and to correct the records of name, relationship to person number one, sex, age, Hispanic origin, and race.

The corrected data were used on the follow-up form, but not sent to estimation. The updated data were not inserted into the HCUF. This updating was for matching in A.C.E. and for the follow-up form only. The matchers were
NOT looking at people who were not data-defined to see if there was more information on the census form to make them data-defined. Therefore, people were NOT created in the census.

Duplicate Search Within Cluster. The search for duplicates was done clerically. A person was duplicated when the data collected for the person was repeated within the block cluster. The printouts used in 1990 for duplicate search were automated in 2000. Search routines in the 2000 clerical matching software made the searches quicker and more accurate. Duplicates were linked in the matching system for later analysis.

Duplicated People Were Identified:
* Within the P sample. A duplicated P-sample person was removed from the final P sample, because both people were not needed in that household in the P sample. When the whole households of P-sample people were duplicated, one of the housing units was converted to a noninterview because the interview was not a good one. The duplicated P-sample household was in a different housing unit and one of them was included instead of the people who actually lived at the address. For example, the Smith family was collected in apartments A and B. Both apartments were housing units.
The P-sample interview for the duplicated family is not a good interview and is converted to a noninterview after the P-sample people were removed.
* Within the E sample. An E-sample person duplicate was an erroneous enumeration in the census.
* Between E-sample people and people not in the E sample. The E-sample people were also compared to the census people in housing units within the same sample cluster who were not in sample in large block clusters after the E-sample identification. There was no duplicate search between E-sample people and people enumerated in group quarters. Also, there was no duplicate search within group quarters.

When duplication between an E-sample person and a non-E-sample person was identified, it indicated that there was not a full erroneous enumeration. Therefore, the probability of erroneous enumeration caused by duplication was needed for the duplicated E-sample person. The formula for the probability of erroneous enumeration was 100 times d divided by c + d + 1 percent, or

$$\Pr(EE) = \frac{100\,d}{c + d + 1}\ \text{percent}$$

where
c = number of times the E-sample person was duplicated with another E-sample person
d = number of times the E-sample person was duplicated with a non-E-sample person

For example, c = 0 and d = 1 gives 50 percent, which is the value assigned in 1990. In 1990, when there was duplication between a person in the E sample and a person in a household that was not in the large-cluster subsample, and therefore not in the E sample, the E-sample person was assigned a probability of erroneous enumeration of one half. This methodology was refined in the 2000 A.C.E. to accommodate triplicates. The 1990 estimate was biased when there was a triplicate enumeration in the census and this triplicate involved two E-sample duplicates and the triplicate was not in the E sample. However, there were only a few of these cases in 2000.

This assumes the E-sample person had been coded as correctly enumerated. If the E-sample person was coded unresolved, the final probability of erroneous enumeration included an imputation for unresolved enumeration status. If the E-sample person was assigned a match code that indicated erroneous enumeration, the number of times that the E-sample person was duplicated with non-E-sample people was irrelevant and ignored. A person could not have a probability of erroneous enumeration that was larger than 100 percent.

Census Geocoding Errors

The clerical matchers reviewed people in census housing units identified in the housing unit matching as geocoding errors. The clerical matchers assigned a code indicating geocoding error to E-sample persons for whole household
E-sample nonmatches. There was no need for a follow-up interview, since the housing unit follow-up operation identified these housing units with geocoding errors. These E-sample people were erroneously enumerated in this sample cluster because they were enumerated in a housing unit that was incorrectly geocoded to this sample cluster. In 1990, these people were followed up because it wasn't clear who was incorrectly geocoded until after the follow-up interview.

Coding Nonmatches in Large Households

The mail return short form had a continuation roster to collect names for persons seven through twelve. The mail return long form had a roster for the names of persons one through twelve. Data were collected for the first six people in the household, for both long and short forms. If the large household follow-up was unsuccessful, there were only names for persons seven through twelve for the long and short mail return forms. Census records were not created for the people in households with only names, since they were not data-defined.

The names on the rosters were used to reduce the P-sample follow-up of nonmatches in large households. P-sample people in large households who were found on the large household roster were not followed up because they were residents of the housing unit on Census Day. They were still counted as not matched to a census enumeration, but a follow-up interview was not needed to establish their residence on Census Day.

Targeted Extended Search

P-sample whole household nonmatches with no address match and E-sample whole households of nonmatched people in housing units coded as geocoding errors had their search area expanded into the first ring of surrounding blocks. The expanded search is referred to as targeted extended search (TES). See Chapter 5 for a full discussion.

The targeted extended search for 2000 A.C.E. was a two-stage process. First, clusters were identified that would benefit most from expanding the search area to surrounding blocks. Second, blocks within the surrounding blocks
were targeted for searching.

This extended search was targeted at the clusters most likely to benefit from expanding the search area. The clusters selected for targeted extended search for the 2000 Accuracy and Coverage Evaluation were:
* Clusters included with certainty
* Relisted clusters in A.C.E.
* The 5 percent of clusters having the most unweighted census geocoding errors and A.C.E. address nonmatches
* The 5 percent of clusters having the most weighted census geocoding errors and A.C.E. address nonmatches
* Clusters selected at random from the clusters with A.C.E. housing unit nonmatches (i.e., A.C.E. housing units coded CI or UI) or census housing units identified as geocoding errors (i.e., coded GE)

The clusters not selected for targeted extended search were:
* Clusters not selected from the clusters with A.C.E. housing unit nonmatches (i.e., A.C.E. housing units coded CI or UI) and census housing units identified as geocoding errors (i.e., coded GE), (i.e., TES eligible for sampling, but not selected)
* Clusters with no A.C.E. housing unit nonmatches or census geocoding errors identified in the housing unit matching.
* List/Enumerate clusters

Table 4-9 contains the number of clusters selected for TES and the number of P-sample and E-sample people in TES. The number of clusters includes the clusters included with certainty because they were relisted. P-sample people with a residence probability of zero have been excluded from the table.

Table 4-9. The TES Sample

                              Clusters   P-sample people   E-sample people
Included with certainty ..... 1,150      28,533            20,572
Sampled for TES ............. 1,089      3,889             2,281
Total ....................... 2,239      32,422            22,853

Clusters with the most unweighted and weighted census geocoding errors and A.C.E. address nonmatches were included because some clusters with large weights contribute disproportionately to the estimates. Approximately 10 percent of the clusters of the remaining clusters with A.C.E. housing unit nonmatches and census geocoding errors (49 percent of all clusters) were selected at random. There were 2,239 clusters selected for targeted extended search.

In the second stage of targeting, the work was targeted to blocks within the search area where the geocoding error was located. In 1990, the effort required to search for matches and duplicates in large areas that had only a few possible matches or duplicates appeared to lead to errors. There was anecdotal evidence of clerks who did not bother to look in surrounding blocks because they rarely found anything. Targeting the expanded searching probably reduced clerical errors, as well as the cost of the operation.

P-Sample Matching Extended Search

The search area was expanded to clerically search the ring of surrounding blocks for the P-sample whole household nonmatches, when a housing unit was not a match in housing unit matching (i.e., the housing unit match code was a nonmatch or unresolved). There was no searching in surrounding blocks for partial household nonmatches or for whole household nonmatches with matching addresses.

How the search was done depended on whether the cluster and its surrounding blocks consisted solely of urban type addresses, or whether they consisted of some or all rural type addresses.
* In areas that are completely urban, if the clerk located the basic street address in the surrounding blocks or the clerk determined the range of addresses was in the surrounding blocks, person matching was conducted in that block where the basic street address or range was located. The matching was also conducted when there was a possible address match in a surrounding block.
* In rural or mixed urban and rural areas, because of the difficulties in matching rural type addresses, there was no attempt to match addresses in the surrounding blocks. Instead, people were searched for in all of the
surrounding blocks.

E-Sample Extended Search for Geocoding Errors

A census person in a housing unit that was coded as a geocoding error was an erroneous enumeration unless the housing unit was located inside the expanded search area. The census geocoding errors were identified in the housing unit phase of the A.C.E. Another interview identified the housing units that physically existed in the surrounding blocks, instead of within the cluster where they were enumerated. This field work was done for whole household E-sample nonmatches in housing units identified during the housing unit phase as geocoding errors. This field visit was conducted at about the same time as the A.C.E. person interview. The people in these housing units were coded as follows:
* If the housing unit was found to exist in the surrounding blocks, the clerks coded the E-sample person as geocoded to the surrounding blocks during the before follow-up person matching.
* If the housing unit existed in the sample cluster, the E-sample person was coded as not a geocoding error, because that housing unit did exist in the sample cluster.
* If the housing unit did not exist in the surrounding blocks or could not be located on the map sent with the case, the E-sample person was coded as a geocoding error, indicating the person was erroneously enumerated because the housing unit was incorrectly geocoded in the block cluster.
* If the field work was not done or if it could not be determined if the block number entered on the form was in the block cluster or in the surrounding blocks, the unresolved code was used. There was no follow-up for the unresolved cases.

A person follow-up interview for the E-sample nonmatches coded in the sample cluster or in the surrounding blocks was needed to identify other reasons for erroneous enumeration, such as fictitious people and other residences where people should have been counted on Census Day.

E-Sample Targeted Duplicate Search

A search for duplicated people was conducted clerically in the targeted extended search clusters, when the housing unit was identified during the field interview as physically existing in the surrounding blocks. Like the P-sample search for missed units, the duplicate search was created to identify people who were duplicated because of geocoding error. There was no searching for duplicates in the group quarters enumerations.

If an E-sample housing unit was identified as existing in the surrounding blocks, a housing unit duplicate search was conducted. How this was done depended on whether the cluster and its surrounding blocks consisted solely of urban style addresses or whether they were some or all rural style addresses.
* In urban areas, this duplicate search was done first on housing units and then on people. First, the clerks searched in the block where the housing unit should have been counted in the ring of surrounding blocks. If the housing unit was duplicated, a search was conducted to identify duplicated people. The duplicate search was conducted only in the block where the duplicated housing unit was located. These people were duplicated because the housing unit was enumerated correctly in a surrounding block and incorrectly in the sample cluster. If the housing unit was not duplicated, a search for person duplication was not conducted. The search concentrated on people who were duplicated and were in duplicated housing units caused by housing unit geocoding error in the surrounding blocks.
* The duplicate search in rural or mixed areas was a search throughout the entire search area for person
duplicates.AddedandDeletedCensusHousingUnitsCensuscoverageoperationscontinuedpastthecreationoftheJanuary,2000DMAF.Asaresult,anaddedcensushousingunitisonethatwasnotintheinitialhousingunitmatching,becauseitwasaddedtotheinventoryofcensus housingunitsaftertheJanuary,2000DMAFwascreated.A deletedcensushousingunitisonethatwasintheJanuary,2000DMAF,butwasremovedfromtheclusterbeforethefinalinventoryofhousingunitswascreated.ThetargetedextendedsearchwasbasedontheA.C.E.housingunitmatchingtotheJanuary,2000DMAFanddid notcovercensushousingunitsaddedtotheblockcluster sincehousingunitmatching,thusexcludinganygeocod-ingerrorsthatwerenotrecognizedintimetoconductthe TESfieldfollow-up.Ifaclusterwasnotidentifiedfortar-getedextendedsearchandalargebuildingwasaddedto thecluster,thefirsttimeitcouldhavecometoouratten-tionwasduringpersonmatchingandanyaddedhousing unitswouldbeidentifiedasgeocodingerrorsduringthe personfollow-up.Ifanyofthesecasesshouldhavebeen includedinthetargetedextendedsearchandwereincor-rectlygeocoded,anotherfollow-upoperationwouldhave beenneededtoidentifytheonesthatactuallyexistedin thesurroundingblocksandthosethatexistedoutsidethe expandedsearcharea.Therewasnotsufficienttimetoconductanotherinterviewtodeterminewhichaddedcensushousingunitswith geocodingerrorreallyexistedinthefirstringofsurround-ingblocks.Thesecaseswerehandledintwoways:*InTESclustersandclusterseligibleforTESsampling,thepeopleinaddedhousingunitswherepersonfollow-upidentifiedgeocodingerrorweretreatedas unresolvedandtheprobabilityofcorrectenumeration wasimputed.Thesenewunresolvedcasesweretreatedthesameasanyotherpersoncodedwithunresolvedgeography.*WhenthehousingunitwasnotinaTEScluster,thepeopleremainedcodedasgeocodingerrorsandwereerroneousenumerations.Asimilarlimitationexistedwhenahousingunitthatwasmatchedinthehousingunitmatchingwaslaterdeleted.
There was a concern that the deleted unit may have been moved to a surrounding block. Clusters in which matched housing units in the DMAF were deleted from the HCUF had no chance of being TES clusters if the cluster had no A.C.E. housing unit nonmatches or census geocoding errors.

These deleted cases were also treated differently depending on whether they were in TES clusters:

- If in a TES cluster, they were identified as TES people and a surrounding block search was conducted for the housing units in the TES P-sample matching.
- If the housing unit was not in a TES cluster, there was no surrounding block matching. Surrounding block matching could not be done because there were no surrounding block people in non-TES clusters.

Before Follow-Up Results

Tables 4-10 and 4-11 contain the results of before follow-up matching for the P sample and the E sample. For details of these codes, see Childers (2001). These before follow-up matching results are from unweighted data from the fifty states and the District of Columbia.

The P-sample codes are grouped into:

- Matched
- Not matched
- Possible match
- Unresolved match status
- Removed from the P sample

Matched. The P-sample person was found in the census.

Not matched. The P-sample person was not found in the census. A follow-up interview was conducted for:

- Partial household nonmatches
- Whole households of conflicting household members (i.e., whole households of P-sample and census nonmatches) [2]
- Other whole household nonmatches where the P-sample interview was conducted with a nonhousehold member [3]

Possible match. The P-sample person may have been a match to the census person. A follow-up interview was needed to determine if the two names referred to the same person.

Unresolved match status. The only category of unresolved in the before follow-up matching was insufficient information for matching and follow-up.

Removed from the P sample. The only category of removed from the P sample in the before follow-up matching was the P-sample people coded as duplicates.

The E-sample codes are grouped into:
- Correctly enumerated
- Erroneously enumerated
- Nonmatch
- Possible match
- Unresolved

Correctly enumerated. The correctly enumerated people in before follow-up matching were the ones matching the P sample.

Erroneously enumerated. The categories during before follow-up were fictitious people, duplicates, insufficient information for matching and follow-up, and geocoding errors.

- The fictitious people were those where notes on the census image identified the person as one who died before or was born after Census Day, or as not a real person such as a dog or other pet.
- The E-sample people enumerated more than once were coded as duplicates.
- The E-sample people with insufficient information for matching and follow-up were those who were data-defined, but did not contain full name and at least two characteristics. [4]
- Census people in housing units identified as geocoding errors [5] during the initial housing unit follow-up were coded as erroneously enumerated because of geocoding error.

Nonmatch. All E-sample people who did not match to the P sample were sent for a follow-up interview.

Possible match. E-sample people who were coded as possible matches were followed up to determine whether they were, in fact, matches.
Unresolved. In before follow-up matching, the unresolved category included only the census housing units that needed targeted extended search field work for which that field work was not done.

Table 4-10. P Sample Before Follow-Up Matching

P-sample match status      Unweighted people   Percent
Matched                              573,506      85.7
Not matched                           76,804      11.5
Possible match                         5,070       0.8
Unresolved                             7,524       1.1
Removed                                5,923       0.9
Total                                668,827     100.0

[2] These cases have been called the Smith/Jones cases in the past.
[3] No follow-up interview was conducted when there were whole households of P-sample nonmatches from interviews with household members in a housing unit that did not match in the housing unit operation or matched to a housing unit containing no data-defined people.
[4] This is the same rule that was used in the 1990 PES. There must have been enough information about the person to have a chance at locating the person for a follow-up interview before the person was allowed into the matching process. See Childers (2001).
[5] A geocoding error is an error in assigning the housing unit to the correct location.

Table 4-11. E-sample Before Follow-Up Matching

E-sample enumeration status   Unweighted people   Percent
Correctly enumerated                    544,995      76.4
Erroneously enumerated                   27,934       3.9
Not matched                             134,916      18.9
Possible match                            4,751       0.7
Unresolved                                  304       0.0
Total                                   712,900     100.0

Note: Percentages in table may not add to total due to rounding.

A.C.E. Person Follow-Up

The person follow-up was conducted to gather additional information to accurately code the residence status of the nonmatched P-sample people and the enumeration status of the E-sample people. In addition, the match status of the possible matches was resolved during the follow-up interview. The following cases were sent to person follow-up:

- P-sample partial household nonmatches
- P-sample whole household nonmatches where the census enumerated different E-sample people (i.e., conflicting households or Smith/Jones cases)
- P-sample whole household nonmatches where the A.C.E. person interview was with a proxy respondent
- E-sample nonmatches
- Possible matches between the P sample and the census
- P-sample matches and nonmatches with unresolved residence status
- P-sample nonmatches needing additional geographic work [6]

The results of the follow-up interview were recorded in the matching software by the matching clerks. Table 4-12 contains the results of the follow-up coding for the P-sample people who were followed up. The P-sample people who were followed up were clerically classified as:

- Matched
- Nonmatched resident of the cluster on Census Day
- Unresolved residence or match status
- Nonresident of the cluster on Census Day

Matched. The P-sample person was found in the census in the block cluster or in a surrounding block after the follow-up interview.

Nonmatched resident of the cluster on Census Day. The P-sample nonmatch was not found in the census, and the follow-up interview determined he or she should have been counted in the search area for this cluster.

Unresolved residence or match status. The person had unresolved residence status, because the follow-up interview did not successfully collect the information required to accurately identify this person as a resident of the cluster on Census Day. In the case of possible matches, the follow-up interview was not able to ascertain the match status of the people.

Nonresident of the cluster on Census Day.
The P-sample person was not a resident of the housing unit on Census Day and was removed from the P sample. These people were duplicates, fictitious, living in a P-sample housing unit that was listed in the cluster in error (i.e., P-sample geocoding error), or the P-sample person should have been counted at another residence on Census Day.

The results of the follow-up interview in Table 4-12 indicate 14.7 percent unresolved and 12.5 percent removed from the P sample.

Table 4-12. Results of P-sample Follow-Up Interview

After follow-up match code   Unweighted people   Percent
Matched                                  9,793      19.4
Nonmatched resident                     26,961      53.4
Unresolved                               7,451      14.7
Nonresident                              6,296      12.5
Total                                   50,501     100.0

Table 4-13 contains the results of the E-sample follow-up interviews. The followed-up E-sample people were classified as:

- Matched
- Correctly enumerated
- Erroneously enumerated
- Unresolved

Matched. The P-sample and E-sample enumerations refer to the same person. The match was made after the follow-up interview.

Correctly enumerated. The E-sample nonmatch was identified during the follow-up interview as correctly enumerated in the census.
[6] Housing units in relist and list/enumerate clusters did not have housing unit matching. Therefore, P-sample geocoding errors in such clusters needed to be identified during person matching. In addition, when the interviewer changed the address in the CAPI instrument, the P-sample geography was checked to make sure the interviewer did not interview outside the sample cluster.

Erroneously enumerated. The E-sample nonmatch was identified during the follow-up interview as erroneously enumerated in the census, because the person should have been counted at another residence on Census Day, was fictitious, had insufficient information for matching and follow-up, was duplicated, or lived in a household that was a geocoding error.
Unresolved. The follow-up interview for the census nonmatch was not successful.

The results of the E-sample follow-up in Table 4-13 indicate 7.4 percent of the E-sample people followed up were erroneously enumerated and 14.1 percent were unresolved.

Table 4-13. Results of E-sample Follow-Up for Nonmatches and Possible Matches

After follow-up match code   Unweighted people   Percent
Matched                                  9,088       6.3
Correctly enumerated                   103,589      72.2
Erroneously enumerated                  10,618       7.4
Unresolved                              20,185      14.1
Total                                  143,480     100.0

After Follow-Up Coding

After the follow-up was completed, the results of the interviews were reviewed and codes entered into the system by the matching clerks. See Attachments 5, 6, and 7 for definitions of the individual match, enumeration, and residence status codes assigned by the matching clerks.

The final P-sample results are shown in Tables 4-14 and 4-15. The P-sample people have been classified as matched, not matched, unresolved match status, and removed in Table 4-14 and also tabulated as resident, nonresident, and unresolved residence status in Table 4-15.
The data are unweighted, but the people sampled out of the targeted extended search are removed from tabulations for this section.

The P-sample match status is defined as:

- Matched
- Not matched
- Unresolved match status
- Removed from the P sample

Matched. The P-sample person was found in the cluster or in the surrounding block in either a housing unit or in group quarters.

Not matched. The P-sample person was not found in the search area. If the nonmatch was sent to follow-up, the person was confirmed to be a resident of the cluster on Census Day. If the nonmatch was not sent for a follow-up interview, a household member identified the person as a resident of the housing unit during the original A.C.E. interview.

Unresolved match status. The match status was unresolved for possible matches with unsuccessful follow-up interviews and for P-sample people with insufficient information for matching and follow-up.

Removed from the P sample. People were removed from the P sample when they were fictitious, duplicates, geocoding errors, or not residents of the housing unit on Census Day.

Table 4-14. P-sample Match Status After Follow-Up

P-sample after follow-up match status   Unweighted people   Percent
Matched                                           578,695      88.6
Not matched                                        54,424       8.3
Unresolved                                          7,826       1.2
Removed                                            12,393       1.9
Total                                             653,338     100.0

The P-sample residence status was defined as:

- Resident
- Nonresident
- Unresolved residence status

Resident. The P-sample matched or not matched person was a resident of the housing unit on Census Day.
Nonresident. P-sample people were nonresidents of the cluster when they were fictitious, duplicates, geocoding errors, or should not have been included as a resident of the housing unit on Census Day. Nonresidents were removed from the P sample.

Unresolved residence status. A matched or not matched P-sample person had unresolved residence status when the follow-up interview did not successfully determine the person's residence on Census Day. The residence status of the possible match was unresolved when the follow-up interview was not successful. The residence status was also unresolved when the P-sample person had insufficient information for matching.

Table 4-15. P-sample Residence Status After Follow-Up

P-sample after follow-up residence status   Unweighted people   Percent
Resident                                              625,863      95.8
Nonresident                                            12,393       1.9
Unresolved                                             15,082       2.3
Total                                                 653,338     100.0

The final E-sample results are in Table 4-16. The E-sample people were classified as correctly or erroneously enumerated or having an enumeration status of unresolved.
These are the unweighted match results that went to imputation and estimation, with the people sampled out of the targeted extended search removed.

The E-sample enumeration status was defined as:

- Correctly enumerated
- Erroneously enumerated
- Unresolved enumeration status

Correctly enumerated. E-sample people were correctly enumerated when they were matched to the P sample, or when they had been followed up and should have been enumerated in this cluster.

Erroneously enumerated. E-sample people were erroneously enumerated when they had another residence where they should have been counted on Census Day, were fictitious, were duplicated, lived in a housing unit that was a geocoding error, or had insufficient information for matching and follow-up.

Unresolved enumeration status. E-sample people had unresolved enumeration status when the follow-up interview was unsuccessful. The E-sample person may have been followed up to obtain information about the E-sample nonmatch, possible match, matched person with unresolved residence status, or geographic work to obtain the location of the housing unit.

Table 4-16. E-sample Matching After Follow-Up

E-sample enumeration status   Unweighted people   Percent
Correctly enumerated                    652,390      92.6
Erroneously enumerated                   31,064       4.4
Unresolved                               21,148       3.0
Total                                   704,602     100.0

There were unresolved codes assigned to P-sample and E-sample people. A probability of being matched was imputed for a P-sample person with unresolved match status. A probability that the P-sample person was a resident was imputed when the follow-up did not give enough information to resolve the person's residence status. The probability that a P-sample person was a resident was the probability that the person should have been included in the P sample. The probability that the E-sample person was correctly enumerated was also imputed for the E-sample people with unresolved enumeration status.

A P-sample person could be matched, but have unresolved residence status or have both match and residence status unresolved. Therefore, tabulations for match status and residence status are shown separately for the P sample.

Estimation Outcome Codes

Two sets of outcome codes were prepared, one for the Census Day household and one for the Interview Day household. The final P-sample estimation outcome code identified the status of the interview for estimation on Census Day and on the day of the interview. For example, there were cases that were complete interviews for the current residents, but were reported as noninterview or vacant for the Census Day residents.

The final Census Day outcome codes are in Table 4-17. Outcome codes were changed as a result of the follow-up interview in the following types of situations:
- No Census Day residents-noninterview. Whole households of P-sample people who said they lived elsewhere on Census Day were converted to noninterviews.
- No Census Day residents-vacant. Whole households who lived in group quarters on Census Day or should have been enumerated at another residence were converted to vacant.

The outcome codes for these two situations were changed because new information from the follow-up interview indicated the original interview was incorrect. The housing unit outcome code for people identified as residents of the housing unit from the person interview who said in the follow-up interview that they lived elsewhere was changed to noninterview. The original person interview listed this household as residents of the housing unit when they did not live at this address. The interview was incorrect and was converted to a noninterview.

The housing unit outcome codes for people identified as residents of the housing unit from the person interview who said in the follow-up interview that they lived in group quarters or should have been enumerated at another residence were changed to vacant. The original person interview should have classified the housing unit as vacant, because the people should have been enumerated at another address.

Table 4-17 also contains numbers of housing units identified as interviews, noninterviews, and vacant, and percentages of total housing units, and numbers and percentages of occupied housing units. The noninterview rate for occupied housing units for Census Day was 3.0 percent.
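As a rough check on the 3.0 percent figure, the sketch below recomputes the Census Day noninterview rate from the unweighted counts published in Table 4-17. The dictionary layout and the function name `noninterview_rate` are illustrative assumptions, not part of A.C.E. processing; only the counts come from the table.

```python
# Unweighted housing unit counts transcribed from Table 4-17.
# The grouping into a nested dict is an illustrative choice.
census_day = {
    "interview": {
        "household member": 233_327,
        "proxy respondent": 18_335,
        "sufficient partial": 2_513,
    },
    "noninterview": {
        "no Census Day residents": 2_709,
        "field noninterview": 2_667,
        "insufficient information": 2_418,
    },
    "vacant": {
        "no Census Day residents": 4_561,
        "vacant on Census Day": 23_911,
    },
}


def noninterview_rate(interviews: int, noninterviews: int) -> float:
    """Percent noninterview among occupied units.

    Occupied units are taken to be interviews plus noninterviews,
    which is how the text describes the denominator.
    """
    occupied = interviews + noninterviews
    return 100.0 * noninterviews / occupied


# Sum the detail rows to recover the group totals shown in the table.
totals = {group: sum(rows.values()) for group, rows in census_day.items()}
rate = noninterview_rate(totals["interview"], totals["noninterview"])

print(totals)
print(round(rate, 1))
```

With these inputs the detail rows sum to 254,175 interviews, 7,794 noninterviews, and 28,472 vacant units, and the rate rounds to 3.0 percent.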
Addresses that were not housing units on Census Day were removed from the P sample.

Table 4-17. Final Census Day Estimation Outcome Codes for A.C.E. Housing Units (Unweighted)

Census Day outcome code                                                Total housing units    Occupied housing units
                                                                       Number     Percent     Number     Percent
Census Day interview                                                   254,175       87.5     254,175       97.0
- Complete Census Day interview with a household member                233,327
- Complete Census Day interview with a proxy respondent                 18,335
- Sufficient partial interview                                           2,513
Census Day noninterview                                                  7,794        2.7       7,794        3.0
- No Census Day residents                                                2,709
- Field Census Day noninterview                                          2,667
- All people have insufficient information for matching and follow-up    2,418
Total occupied Census Day housing units                                                       261,969      100.0
Vacant                                                                  28,472        9.8
- No Census Day residents                                                4,561
- Vacant on Census Day                                                  23,911
Total occupied and vacant housing units on Census Day                  290,441      100.0
Not a housing unit on Census Day                                        10,472
Total housing units                                                    300,913

The Census Day noninterview rates in Tables 4-18 and 4-19 are for occupied housing units. The percent noninterview was calculated as the unweighted number of Census Day noninterviews divided by the number of occupied Census Day housing units, which was the interviews plus the noninterviews on Census Day. The Census Day noninterview rates were recalculated to reflect changes due to coding in after follow-up matching.

Table 4-18. P-sample Noninterview Rates for Census Day in Occupied Housing Units by Interview Mode

Interview mode       Percent noninterview
Telephone                             1.1
Personal                              3.7
Quality assurance                    37.4
Total                                 3.0

Table 4-19. P-sample Noninterview Rates for Census Day in Occupied Housing Units by Type of Interview

Type of interview                     Percent noninterview
Interview with a household member                      1.8
Proxy interview                                       17.4
Total                                                  3.0

Comparison of Initial and Final P-Sample Estimation Outcome Codes for Census Day

Table 4-20 compares the preliminary and final Census Day interview outcome codes. The preliminary Census Day outcome codes were changed when the follow-up interviews for the P sample classified people as nonresidents because they did not live at the sample address at the time of the census, or when they were considered as living at the sample address but should have been counted at another residence such as group quarters or another home. The housing unit could also be identified as not being a housing unit on Census Day.

Table 4-20. Comparison of the Preliminary and Final Census Day Outcome Codes

Rows are preliminary Census Day outcome codes. Columns are final Census Day outcome codes: (1) Interview with household member; (2) Interview with proxy; (3) Partial interview; (4) No Census Day residents-noninterview; (5) Field noninterview; (6) Whole household insufficient information; (7) No Census Day residents-vacant; (8) Vacant; (9) Not a housing unit.

Preliminary code                               (1)      (2)     (3)     (4)     (5)     (6)     (7)      (8)      (9)
Interview with household member            233,327        0       0   2,033       0       0     125        0      147
Interview with proxy                             0   18,335       0     676       0       0     252        0      117
Partial interview                                0        0   2,513       0       0      97       0        0        2
Field noninterview                               0        0       0       0   2,667       0       0        0        0
Whole household insufficient information         0        0       0       0       0   2,321       0        0        0
No Census Day residents-vacant                   0        0       0       0       0       0   4,184        0        0
Vacant                                           0        0       0       0       0       0       0   23,911        0
Not a housing unit                               0        0       0       0       0       0       0        0   10,206

Table 4-21. Final Interview Day Estimation Outcome Codes for A.C.E. Housing Units (Unweighted)

Interview Day outcome code                                               Total housing units    Occupied housing units
                                                                         Number     Percent     Number     Percent
Interview Day interview                                                  264,103       89.0     264,103       98.9
- Complete interview on Interview Day with a household member            249,854
- Complete interview on Interview Day with a proxy respondent             12,317
- Sufficient partial interview                                             1,932
Interview Day noninterview                                                 3,052        1.0       3,052        1.1
- No Interview Day residents-household converted to noninterview             483
- Field noninterview on Interview Day                                        373
- All people have insufficient information for matching and follow-up      2,196
Total occupied housing units on Interview Day                                                   267,155      100.0
Vacant on Interview Day                                                   29,662       10.0
Total occupied and vacant housing units on Interview Day                 296,817      100.0
Not a housing unit on Interview Day                                        4,096
Total housing units                                                      300,913

Final P-Sample Estimation Outcome Codes for Interview Day

The final Interview Day outcome codes are in Table 4-21. The interview outcome, as of Interview Day, was for cases originally classified as nonmovers and inmovers. Changes as a result of the follow-up interview were from whole households of nonmovers who said they:

- Never lived at this residence
- Lived in group quarters on Census Day
- Lived at another residence on Census Day

The outcome codes for these cases were converted to noninterviews.

The Interview Day noninterview rates were recalculated to reflect changes due to coding in after follow-up matching.
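The Interview Day conversion rule and the recalculated rate can be sketched as follows. This is a minimal illustration under stated assumptions: the `Household` record, the finding labels, and the eight example households are all hypothetical and do not reproduce the A.C.E. matching software or its data.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical labels for the three follow-up findings listed above.
CONVERTING_FINDINGS = {
    "never_lived_here",
    "group_quarters_on_census_day",
    "other_residence_on_census_day",
}


@dataclass
class Household:
    preliminary_outcome: str                 # "interview" or "noninterview"
    all_nonmovers: bool = True               # rule covers whole households of nonmovers
    followup_finding: Optional[str] = None   # None when follow-up found nothing new


def final_outcome(hh: Household) -> str:
    """Convert to noninterview when a whole nonmover household reports a listed finding."""
    if hh.all_nonmovers and hh.followup_finding in CONVERTING_FINDINGS:
        return "noninterview"
    return hh.preliminary_outcome


def noninterview_rate(outcomes: List[str]) -> float:
    """Percent of occupied units whose outcome is a noninterview."""
    return 100.0 * outcomes.count("noninterview") / len(outcomes)


# Made-up example: eight occupied units, one of which is converted by follow-up.
households = (
    [Household("interview") for _ in range(6)]
    + [Household("interview", followup_finding="group_quarters_on_census_day")]
    + [Household("noninterview")]
)
before = noninterview_rate([h.preliminary_outcome for h in households])
after = noninterview_rate([final_outcome(h) for h in households])
print(before, after)
```

In this toy example the rate moves from 12.5 to 25.0 percent once the converted household is counted as a noninterview, which is the kind of recalculation the text describes.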
The final noninterview rates for Interview Day by interview mode and type of interview are in Tables 4-22 and 4-23.

Table 4-22. P-sample Noninterview Rates for Interview Day in Occupied Housing Units by Interview Mode

Interview mode       Percent noninterview
Telephone                             0.7
Personal                              1.0
Quality assurance                    15.4
Total                                 1.1

Table 4-23. P-sample Noninterview Rates for Interview Day in Occupied Housing Units by Type of Interview

Type of interview                     Percent noninterview
Interview with a household member                      0.5
Proxy interview                                        8.6
Total                                                  1.1

Attachment 1. A.C.E. Mover and Residence Status Code
- A.C.E. Mover Code: 1 = Nonmover; 2 = Inmover; 3 = Outmover
- A.C.E. Born Since Census Day Code: 0 or blank = Default for inmovers; 1 = Born on or before Census Day; 2 = Born since Census Day; D = Don't know; R = Refused
- A.C.E. Group Quarters Code: 0 or blank = Default for whole household inmovers [7]; 1 = In group quarters on Census Day; 2 = Not in group quarters on Census Day; D = Don't know; R = Refused
- A.C.E. Other Residence Code: 0 or blank = Default for whole household inmovers; 1 = In other residence on Census Day; 2 = Not in other residence on Census Day; D = Don't know; R = Refused
- A.C.E. Status: N = Nonmover, resident on Census Day; O = Outmover, resident on Census Day; I = Inmover, nonresident on Census Day; R = Removed, nonresident on Census Day; U = Unresolved residence status; B = Born since Census Day, nonresident on Census Day

[7] Partial household inmovers were assigned the codes of 1, 2, D, or R during the edit for CAPI data review.

Attachment 2. The Treatment of Group Quarters in A.C.E.

The A.C.E. was designed to provide estimates of person coverage in housing units. There was no sample of, and no estimates for, persons in group quarters. The P-sample housing units were selected for the A.C.E. and the people in the P-sample housing units were matched to the people enumerated in census housing units.

Classifying a structure as group quarters was difficult at times. For example, homes for the elderly have made it more common for a single structure to contain apartments for retired people, assisted living, and full care. Another example was college dormitories. A dormitory was group quarters when it was occupied by unmarried students. The dormitory contained housing units if it was occupied by married students. If the dormitory was mixed with married, unmarried, faculty, and staff, it contained housing units. As a result, housing units or group quarters could be misclassified, when they were not easily classified as housing units or group quarters. This misclassification could be found in both the A.C.E. and the census.

When the P-sample people in A.C.E. housing units did not match to people enumerated in housing units in the census, they were matched to people enumerated in the census in group quarters. That is, group quarters were searched for P-sample nonmatches. If the P-sample people were found in the group quarters enumerations, they were treated as matched. However, no attempt was made to discover whether the misclassification was in the A.C.E. or the census.

Likewise, if a census person in the E sample was enumerated in a housing unit, but the housing unit was misclassified and should have been group quarters, the follow-up of the census nonmatch obtained information about the residence of the person. If it found the person should have been counted in this block in group quarters or a housing unit, the person was coded as correctly enumerated in A.C.E. processing. The ideal was not to classify someone as erroneous when they really should have been counted in this cluster, but the type of residence was misclassified.

If a structure contained both housing units and group quarters, the people who were enumerated in the census in a housing unit were eligible to be in the E sample. The follow-up interview identified such E-sample people who were not matched as living in the cluster and having no other residence. They were coded as correctly enumerated. There was no duplicate search between people enumerated in group quarters and housing units.

In summary, then, the approach was balanced:

- Look for P-sample people in group quarters when they were not found in census housing units.
- Follow up E-sample people in both housing units and group quarters in the cluster.

The population in housing units was covered, but there was no estimate of coverage in group quarters. If the housing unit was duplicated in the group quarters, the group quarters people were not counted as duplicates.
Likewise, if a group quarter was missed, there was no determination of undercounted inhabitants.

Attachment 3. The A.C.E. Person Interview [8]

Household Roster

If the person lived at the sample address, the interviewer began the interview with a series of questions to obtain the names of everyone currently living at the sample housing unit. The first question was: "I need to get a list of everyone living here permanently or staying temporarily at this address. What is your name?"

After obtaining the name of the person with whom the interviewer was speaking, the interviewer asked, "Anyone else?" If there was a "yes" response, the interviewer asked, "What is his or her name?" and followed that with "Anyone else?" until the interviewer received a "no" response.

As a check for types of people who were frequently left off listings of household members, there were two additional questions. The first question asked about people who may have lived at the household sometimes, but not all the time, such as children in joint custody or people who traveled a great deal of the time. The question was: "Are there any additional people who currently live or stay here, like someone who's temporarily away or someone who stays here off and on?" If the response was "yes," the interviewer asked, "What was his or her name?" and followed with "Anyone else?" until the interviewer received a "no" response.

Other persons who were frequently omitted from household listings were roommates or live-in employees. The interviewer asked, "Is there anyone else like a roommate or a live-in employee who lives here?" If the response was "yes," the interviewer asked, "What is his or her name?" and followed with "Anyone else?" until the interviewer received a "no" response.

At this point in the interview, the interviewer had collected a list of household members that the respondent had voluntarily mentioned, and the interviewer had also checked for two types of persons that research had shown were frequently left off household listings. The interviewer then reviewed a screen that contained a list of the household members the respondent reported. The interviewer read the list of names and asked if the list was correct. The interviewer said, "I have listed [READS NAMES ON SCREEN]. Is that correct?" After the respondent had reviewed the names, the interviewer could change the spelling, or add or delete a name.
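The roster questions above follow a simple repeat-until-"no" pattern, which can be sketched as a loop. This is only an illustration of the question flow as described in the text: the function `build_roster`, the `ask` callback, and the scripted answers (including the names) are hypothetical and are not taken from the CAPI instrument. The final review step, in which names could be corrected, added, or deleted, is not modeled.

```python
from typing import Callable, List


def build_roster(ask: Callable[[str], str]) -> List[str]:
    """Collect household names using the open listing and the two coverage probes."""
    roster = [ask("What is your name?")]
    probes = [
        "Anyone else?",
        "Are there any additional people who currently live or stay here, "
        "like someone who's temporarily away or someone who stays here off and on?",
        "Is there anyone else like a roommate or a live-in employee who lives here?",
    ]
    for probe in probes:
        answer = ask(probe)
        # After each "yes", record a name and ask "Anyone else?" until a "no".
        while answer.strip().lower() == "yes":
            roster.append(ask("What is his or her name?"))
            answer = ask("Anyone else?")
    return roster


# Scripted respondent: one extra person in the open listing, none for the
# first probe, and one roommate found by the second probe.
scripted = iter(["Pat Smith", "yes", "Lee Smith", "no", "no", "yes", "Sam Jones", "no"])
roster = build_roster(lambda question: next(scripted))
print(roster)
```

Passing the respondent in as a callback keeps the question flow separate from how answers are obtained, so the same loop can be exercised with scripted answers as shown here.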
Movers

When the respondent agreed that the list was correct, the interviewer handed the respondent a calendar containing the months of March, April, and May of 2000 that had Census Day clearly marked. At this point in the interview, the goal was to begin determining whether the people listed as current residents were also residents of the sample housing unit on Census Day and if anyone else should have been included as a Census Day resident. The interviewer asked if any of the listed persons (current residents) had moved into the sample housing unit after Census Day. The interviewer said to the respondent: "Please look at this calendar. Did any of the people I just listed move into <sample address> after Census Day, April 1, 2000?"

If the answer was "yes," the interviewer asked, "Who moved in after April 1?" Any person mentioned was considered a nonresident of the sample housing unit on Census Day. If everyone in the household was mentioned, then the whole household was considered nonresidents on Census Day.

The interviewer now had a list of current residents who also lived at the sample housing unit on Census Day. It was necessary to determine if there was anyone living at the sample housing unit on Census Day who did not live there currently. The interviewer asked, "Was there anyone else living or staying here on April 1, 2000 who has moved out?" If the response was "yes," the interviewer asked, "What is his or her name?" and "Anyone else?" until a "no" response was received.

The interviewer now had a list of the names of everyone the respondent had reported living at the sample housing unit currently and on Census Day. The interviewer then established a reference person (relationships will be relative to this person) by asking who owns or rents the house or apartment. The interviewer asked, "In whose name is this (house/apartment) owned or rented?" The interviewer also asked whether the housing unit was owned or rented by saying, "Do you own this (house/apartment), rent it, or live here without payment of rent?"

[8] See Keeley (2000) for details.

Demographics

At this point in the interview, the interviewer began to collect demographic characteristics about all listed persons to facilitate matching the persons collected in this interview to persons listed on the census questionnaire for the sample housing unit. Demographic characteristics are also used to create post-strata in dual system estimation. See Chapter 7 for more details.

The demographic characteristics collected in the interview were:

1. Sex. The interviewer may have entered the sex of the person or asked the question when in doubt. The question was, "Is [NAME] [9] male or female?"
2. Age. Age was collected in a series of questions. The interviewer asked for date of birth ("What is [NAME'S] date of birth?"). When the date of birth was entered in the instrument, the age of the person was calculated and the interviewer verified the age by saying, "So [NAME] was about [AGE] on April 1?" If the age was not correct, the interviewer changed the date of birth in the previous question and the age was then recalculated. If the respondent did not know the date of birth, then the interviewer asked the person's age. The interviewer asked, "What was [NAME'S] age on April 1, 2000?"
3. Relationship. Relationship was to the person in whose name the house or apartment was owned or rented (called the Reference Person). The interviewer handed the respondent a card containing relationship categories and asked, "How is [NAME] related to [THE REFERENCE PERSON]?" for each person.
4.HispanicOrigin.Hispanicoriginwascollectedinaseriesofquestions.Thefirstquestionwas,IsanyoneofSpanish,Hispanic,orLatinoorigin?Iftheresponse wasyes,theinterviewerasked,Whois?followed byIsthereanyoneelseofSpanish,Hispanic,orLatinoorigin?untiltheresponsewasno.IfanyonewasmentionedasbeingofHispanicorigin,theinterviewerasked,Is[NAME]ofMexican,Puerto Rican,Cuban,orsomeotherSpanishorigin?foreach personmentioned.
5.Race.Racewasalsocollectedinaseriesofquestions.Theinterviewerreferredtherespondenttothepartof thecardcontainingracialcategoriesandsaid,Im goingtoreadalistofracecategories.Pleasechoose oneormorecategoriesthatbestdescribe[NAMES]
race.Iftherespondentsaid,AmericanIndianorAlaskaNative,theinterviewerasked,Whatis[NAMES]enrolledorprincipaltribe(s)?Theinterviewerrecordedasmanyresponsesasgiven.Iftherespondentsaid,Asian,theinterviewerasked,TowhatAsiangroupdid[NAME]belong?Is[NAME]AsianIndian,Chinese,Filipino,Japanese,Korean,Viet-namese,or,someotherAsiangroup?Theinterviewerrecordedasmanyresponsesasgiven.Iftherespondentsaid,PacificIslander,theinter-viewersaid,TowhatPacificIslandergroupdid[NAME]belong?Is[NAME]GuamanianorChamorro,Samoan,orsomeotherPacificIslandergroup?The interviewerrecordedasmanyresponsesasgiven.Atthispoint,theinterviewerhadalistofallreportedcur-rentandCensusDayresidentsandtheirdemographic characteristicsforuseinmatchingtheseresidentstoresidentsreportedonthecensusquestionnaireforthishousingunit.ForhouseholdsthatreportedmovingintothesamplehousingunitafterCensusDay,thisinformationwasverified.Theinterviewersaidtotherespondent:So,everyoneyoumentionedtodaymovedinto<sampleaddress>afterApril1,2000.Isthat correct?Iftheinformationwascorrect,theinterviewwascontin-uedbyaskingtherespondentifheorsheknewandhad informationabouttheresidentsofthesamplehousingunitwholivedthereonCensusDay.(Thispartoftheinter-viewwasdiscussedinthesectiononmovers.)ResidenceSectionForallhouseholdsinwhichatleastonememberlivedatthesamplehousingunitonCensusDay,theinterviewer continuedwithafewquestionsthatcheckedfortwotypes ofspeciallivingsituationsthatwerepotentialsourcesofduplicateenumerations.Respondentstendedtoforgetthathouseholdmembersmayhavebeenlivingorstaying ataplaceawayfromthesamplehousingunit.Thismayhavecausedsomepersonstobereportedmorethanonce,atthesamplehousingunitandagainatotherplaces wheretheymayhavelivedorstayed.ThefirstsituationthathadthepotentialtocauseduplicateenumerationswaswhenapersonmayhavelivedataplacethatwasnotaprivatehouseholdonCensusDay.
Since the Census Bureau did special enumerations at places such as college dorms, nursing homes, prisons, and emergency shelters, the interviewer inquired if anyone was staying at any of these types of places by saying:

"Your answers to the next few questions help us count everyone at the right place. The Census Bureau did a special count at all places where groups of people stay. Examples include college dorms, nursing homes, prisons, and emergency shelters. On April 1, 2000, were any of the people you mentioned today staying elsewhere at any of these types of places?"10

If the response was yes, the interviewer asked, "Who stayed at one of these types of places?"

9 The brackets containing name, age, and the Reference Person's name were filled by the instrument. When speaking to the respondent, "Are you" or other appropriate fillers replaced "Is [NAME]."

The next situation that could result in a duplicate enumeration was when a person might have had another residence. The interviewer said:

"Some people have more than one place to live. Examples include a second residence for work, a friend's or relative's home, or a vacation home. On April 1, 2000, did any of the people you mentioned today have a residence other than <sample address>?"

If the response was yes, the interviewer asked, "Who had another residence?"

For each person mentioned as having another residence, the interviewer asked, "As of April 1, did [NAME] spend most of the time at <sample address> or at the other residence?" If the response was, "I don't know," the interviewer asked:

"Which of the following categories most accurately describes the amount of time [NAME] stays at the other residence? A few days of each week; entire weeks of each month; months at a time; or some other period of time."

If the respondent still was not sure where the person spent most of the time, there was a series of questions designed to assign an amount of time spent at some other residence, such as, "During a typical week, did [NAME] spend more days at <sample address> or at the other residence?" or "During a typical month, did [NAME] spend more weeks at <sample address> or at the other residence?" If these questions did not help the respondent decide where the person spent most of the time, the person's residence was determined by asking:

"Was [NAME] staying at <sample address> or the other residence on April 1, 2000?"

At this point, the interviewer had a reported list of current and Census Day residents of the sample housing unit developed through an extensive household listing procedure. The interviewer had obtained the demographic characteristics of the listed persons. Through questions on mobility and other possible residences it had been determined:

* whether everyone listed in the household currently should be considered a Census Day resident of the sample housing unit
* whether anyone currently absent from the household should be considered a Census Day resident.

Conclusion of Interview
The interviewer now was ready to conclude the interview. Before concluding, there was one last check of the household listing. The first name, middle initial, last name, sex, and age of each person listed as a current and Census Day resident were shown on the screen. The interviewer, again, showed the respondent the computer screen and asked, "Do I have the spelling, sex, and age correct for everyone?" If not, corrections could be made at this screen and the respondent was asked to verify and/or change the information until the respondent said that everything was correct.

The interviewer asked the respondent for his/her telephone number by saying, "In case we need to contact you again, may I please have your telephone number?" then thanked the respondent and concluded the interview by saying, "This concludes our interview. The Census Bureau thanks you for your participation."
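The other-residence questions in the Residence Section fall back from one question to the next until the respondent gives a definite answer. A minimal Python sketch of that fallback order follows. It is only an illustration: the function name, argument names, and answer labels are hypothetical, the time-category question is left out, and the actual branching of the interviewing instrument is not given in this chapter.

```python
from typing import Optional

# Hypothetical answer labels; the instrument's real response codes are not given here.
SAMPLE = "sample address"
OTHER = "other residence"


def resolve_census_day_residence(
    most_of_time: Optional[str] = None,
    typical_week: Optional[str] = None,
    typical_month: Optional[str] = None,
    staying_on_april_1: Optional[str] = None,
) -> Optional[str]:
    """Return the first definite answer, in the order the questions were asked.

    Anything other than SAMPLE or OTHER (for example None or "don't know")
    is treated as no answer, and the next question is consulted.
    """
    answers = (most_of_time, typical_week, typical_month, staying_on_april_1)
    for answer in answers:
        if answer in (SAMPLE, OTHER):
            return answer
    return None  # still undetermined after every question


if __name__ == "__main__":
    # The respondent did not know where the person spent most of the time,
    # but said the person spent more weeks of a typical month elsewhere.
    print(resolve_census_day_residence("don't know", None, OTHER))
```

A return value of None in this sketch stands for a case the interview could not settle.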
10 An interviewer help screen was available with a complete list of special enumeration places.

Attachment 4. Insufficient Information for Matching and Follow-Up

The census person records were reviewed both by computer and clerically to identify people with insufficient information for matching and follow-up. Only people with sufficient information for matching and follow-up were allowed to be processed in the matching and follow-up interviewing phases of the person matching. The three types of insufficient information were:

* The census people were not data-defined.
* The census people were data-defined, but computer coded as insufficient information for matching and follow-up.
* The census people were computer coded as sufficient information, but converted clerically to insufficient information for matching and follow-up.

The first type, census people who were not data-defined, was not included in the E sample. Only data-defined people were included in the E sample. These data-defined people had person records in the census.

Census Data-Defined
The term "data-defined" has been used in the past at the Census Bureau to mean that a census person record has been created. The term "Total Persons" was the total number of people counted in the census at a census housing unit. The term "Selected Persons" referred to data-defined census people in a census household. The difference was people who were not data-defined. These people had no census person record. A whole person imputation procedure was employed to create characteristic data in the census for these people.

Two characteristics were required to be data-defined, where name counts as a characteristic. Name must have had at least three characters in the first and last name together. Other characteristics that could be used in the counting were relationship, sex, race, Hispanic origin, and either age or year of birth.11 Census records were created on the HCUF for all data-defined people. Anyone who was not data-defined was a whole person imputation.

11 Person one did not automatically have a relationship of head of household like it did in 1990, and the telephone number in item 2, on the mail return questionnaire, did not count as a characteristic. The age and date of birth were examined together. If age was present, age/year of birth counted as a characteristic. If age was blank, but year of birth was present, then the age/year of birth counted as a characteristic. If age and year of birth were both blank, the age/date of birth did not count as a characteristic. The month and day of birth were used in Dress Rehearsal in the determination of counting the age/date of birth as a characteristic, but not in Census 2000.

The count of census people who were whole person imputations was identified separately from the other census people with insufficient information for matching, because they were treated differently in the Dual System Estimator. The number of whole person imputations was subtracted from the census count within post-strata. The E-sample people who were data-defined but with insufficient information for matching were included in the count of erroneous enumerations, and were, thus, excluded from the count of whole person imputations in the Dual System Estimator.

The mail return census forms were designed to collect characteristics for six people. However, space was provided for the names of the additional residents in households with seven to twelve people. The large household follow-up operation attempted to obtain characteristics for these people by telephone.

The exception was the enumerator questionnaire used in nonresponse follow-up. There was space for five people, but a continuation form was used to record data for persons six and above in large households.

There was some consideration given to using the names in the long form roster for persons seven through twelve to create person records and having them data-defined. However, it seemed preferable not to do this, and the A.C.E. did not attempt to create additional census data-defined people for these people with only names in large households. These people were whole person imputations.
The number of whole person imputations used in the Dual System Estimator will correspond to the counts used in the census.

Computer Coding of Insufficient Information for Matching and Follow-Up for the E Sample
The A.C.E. required a minimum amount of information for matching and follow-up. The data-defined census people were reviewed to identify the ones with sufficient information for matching and follow-up for A.C.E. The minimum amount of data required for data-defined census people to have sufficient information for matching and follow-up was complete name and two characteristics. Complete name was defined as:

* First name,12 middle initial, and last name
* First name and last name
* First initial, middle initial, and last name

The A.C.E. used the same criteria as the census for classifying age as data-defined: only age and year of birth were used to determine if age was present when counting characteristics to determine if the person had enough data to be data-defined in the census. In other words, when the age and year of birth were both blank, month and day of birth were not considered.

Clerical Coding of Insufficient Information for Matching and Follow-Up for the E Sample
There were cases where the name was not blank, but was too incomplete or unlikely to be real to permit matching and follow-up. Census names like "Mr. Doe," "Donald Duck," and "White Female" were coded insufficient information by the clerical matchers. The computer could not recognize names that were not real or were really incomplete names.

The retrieval system contained an image of the census questionnaire. The image of the census questionnaire was reviewed for census people coded as insufficient information for matching and follow-up to see if there was additional data that could be used to convert them to sufficient information for matching and follow-up. The data capture system may have had problems reading the handwritten entries, or there may have been information outside the boxes on the census questionnaire. Names were obtained from the roster on the image of the questionnaire for the long forms. Children with first names and no last names were converted to sufficient information for matching and follow-up when the last name could be assumed from an adult with first and last name in the household. These updates to the names were captured into the matching software, which was programmed to decide if the person had sufficient information for matching and follow-up.

P-Sample Insufficient Information for Matching and Follow-Up
The P-sample people were reviewed by computer to identify people with insufficient information for matching and follow-up. The P-sample rules for sufficient information for matching and follow-up were the same as the E-sample rules: complete name and two characteristics. Cases identified by the computer as missing sufficient information were suppressed from viewing by the clerical matchers to prevent errors in matching people with insufficient information for matching. There were fewer than 4,000 P-sample people computer coded with insufficient information for matching and follow-up.

This computer review was established to avoid certain types of clerical errors in matching. Examples are names like "DK" or "Don't Know" ("D" or "Don't" is the first name and "K" or "Know" is the last name), "RR" (refused for the first name and refused for the last name), or "M Smith," which could not be matched with certainty or, if treated as a nonmatch, followed up with a high rate of success. The census might have recorded a person with a complete name, which might be matched by a clerk. If matching were allowed, it would have been biased by what was enumerated in the census. A match would have resulted if the names were present at the address, and a nonmatch if the names were not in the census. Since names like "DK" could not be followed up, they would have been coded as insufficient information for matching and follow-up. Therefore, a match would have been assigned when the census obtained complete names, and unresolved when no match was found. The best way to avoid a bias was to suppress the P-sample cases computer coded as insufficient information for matching and treat them as unresolved.

The probability of a match was imputed for the P-sample people coded as insufficient information for matching and follow-up. They were treated in the same way as other P-sample people with unresolved match status. If the whole household had insufficient information for matching and follow-up, the people were removed and converted to noninterview status.
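The two computer rules in this attachment, data-defined status and sufficient information for matching and follow-up, can be sketched in Python as below. This is a hedged reading of the text, not the Census Bureau's matching software: the record layout and all names are hypothetical, the "two characteristics" in the sufficiency rule are assumed to be counted apart from name, and the complete-name test assumes at least two characters in the last name plus either a first name of at least two characters or a first initial with a middle initial. The clerical checks for unreal names such as "Donald Duck" are not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersonRecord:
    """Hypothetical record layout used only for this illustration."""
    first: str = ""
    middle: str = ""
    last: str = ""
    relationship: Optional[str] = None
    sex: Optional[str] = None
    race: Optional[str] = None
    hispanic_origin: Optional[str] = None
    age: Optional[int] = None
    birth_year: Optional[int] = None


def count_characteristics(p: PersonRecord) -> int:
    """Count non-name characteristics; age and year of birth count once together."""
    fields = (p.relationship, p.sex, p.race, p.hispanic_origin)
    n = sum(1 for value in fields if value not in (None, ""))
    if p.age is not None or p.birth_year is not None:
        n += 1
    return n


def is_data_defined(p: PersonRecord) -> bool:
    """Two characteristics required, where a name of 3+ characters counts as one."""
    name_counts = len(p.first.strip()) + len(p.last.strip()) >= 3
    return count_characteristics(p) + (1 if name_counts else 0) >= 2


def has_complete_name(p: PersonRecord) -> bool:
    """Assumed reading of the complete-name definitions and the two-character minimum."""
    first, middle, last = p.first.strip(), p.middle.strip(), p.last.strip()
    if len(last) < 2:
        return False
    if len(first) >= 2:
        return True
    return len(first) == 1 and len(middle) >= 1


def has_sufficient_information(p: PersonRecord) -> bool:
    """Complete name plus two characteristics (assumed to exclude name)."""
    return has_complete_name(p) and count_characteristics(p) >= 2


if __name__ == "__main__":
    person = PersonRecord(first="M", last="Smith", sex="M", age=30)
    print(is_data_defined(person), has_sufficient_information(person))
```

Under these assumptions a record such as "M Smith" with sex and age is data-defined but lacks sufficient information, which agrees with the "M Smith" example in the text.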
12 The minimum number of characters to be a name was two. Two characters were required in the first name and two characters in the last name.

Attachment 5. Final P-Sample Person Match Codes

Matched
M = The P-sample and census people were matched. The P-sample person was a resident of the housing unit on Census Day.
MR = The follow-up interview determined that the matched person with unresolved residence status was a resident.
MU = The A.C.E. person was matched, but the follow-up interview obtained no useful information to resolve the residence status for the matched person who had a residence status of unresolved before follow-up. The P-sample person's residence status was unresolved.

Not Matched
NP = The P-sample person was not matched to a census person. There was no follow-up for the whole household nonmatches from person interviews with household members and the whole household nonmatches were not conflicting household nonmatches.
NC = The P-sample nonmatch was found on the census roster. This person in a partial nonmatch household was not matched to the census because only name was collected in the census for this person in a large household and the census person was not data-defined. No follow-up interview was necessary.
NR = The P-sample person was not matched and was identified as a resident in the block cluster on Census Day during the A.C.E. person follow-up interview.
NU = The P-sample person was not matched. Not enough information was collected during the A.C.E. person follow-up interview to identify the P-sample person as a resident or nonresident in the block cluster. The residence status for the P-sample person was unresolved. This code was also used when the P-sample person was followed up to collect geographic information and that information was not collected. The NU code was also used when the person did not live at the sample address on Census Day and the Census Day address was not complete enough to determine if the Census Day address was in the sample cluster.
Unresolved
P = There was not enough information collected during the follow-up interview to determine if the possible match was a match or not. The match status of the P-sample person was unresolved.
KI = Match not attempted for the P-sample person, because the person had insufficient information for matching and follow-up. The name was blank or incomplete or the name was complete, but the person had only one characteristic. This was a computer assigned code and these people were suppressed from view by the matchers.
KP = Match not attempted for the P-sample person, because (1) the name was incomplete, such as "Mr. Jones," or (2) the name was not a valid name, such as "White Female" or "Donald Duck." This was a clerically assigned code.

Removed from the P Sample
FP = The P-sample person was fictitious in this block cluster. The person was interviewed in error during the person interview. This person was not included in the final P sample.
NL = The P-sample person did not live at the sample address or in the block cluster on Census Day and was listed as a nonmover or outmover in error. This person was removed from the list of P-sample people, since he or she was collected during the person interview in error.
NN = The P-sample person was identified as a nonresident in the block cluster on Census Day during the A.C.E. person follow-up interview, because the person lived in group quarters on Census Day, or had another residence where the person should have been counted on Census Day according to census residence rules. This person was removed from the list of P-sample people, since he or she was collected during the person interview in error.
DP = The P-sample person was a duplicate of another P-sample person.
MN = The A.C.E. person follow-up interview determined that the matched person with unresolved residence status was not a resident in this housing unit or in this block cluster. The person was no longer in the list of P-sample people.
GP = The P-sample person was removed, because the person interview was conducted at a housing unit that exists outside the sample cluster. The person follow-up identified this housing unit as a P-sample geocoding error.

Attachment 6. E-Sample Person Enumeration Codes

Correctly Enumerated
M = The P-sample and E-sample people were matched. The E-sample person was correctly enumerated.
CE = The E-sample nonmatch was identified as correctly enumerated during the A.C.E. person follow-up interview.
MR = The A.C.E. person follow-up interview determined that the matched person with unresolved residence status was a resident.
Erroneously Enumerated13
GE = The E-sample person was erroneously enumerated in this block cluster, because the census housing unit was a geocoding error (i.e., counted in the wrong block cluster). The E-sample person should have been enumerated elsewhere in the census.
EE = The E-sample nonmatch was identified during the person follow-up interview as erroneously enumerated.
FE = The E-sample nonmatch was determined to be fictitious in this block cluster during the follow-up interview. The person may have existed, but should not have been enumerated in the census within this block cluster. The E-sample person was erroneously enumerated in the census in this block cluster.
DE = The E-sample person was a duplicate of another E-sample person. The code was also used when the E-sample person was a duplicate of a census person in a surrounding block. The people in the E-sample housing unit were erroneously enumerated, because they were counted accurately in the surrounding block and duplicated in the sample cluster.
MN = The A.C.E. person follow-up interview determined that the matched person with unresolved residence status was not a resident in this housing unit or in this block cluster. The E-sample person was an erroneous enumeration.
KE = Match not attempted for the E-sample person. The name was blank or incomplete or the name was complete, but the person had only one characteristic. The name was incomplete or not a valid name, such as "Child Jones" or "Mickey Mouse."
13 The E-sample people who were duplicated with non-E-sample people were not full erroneous enumerations. See the section on "Duplicate Search Within Cluster" in this chapter for a discussion of the probability of erroneous enumeration when there was duplication between a census person in the E sample and a non-E-sample person.

Unresolved
UE = Not enough information was collected during the A.C.E. person follow-up interview to identify the E-sample person as correctly or erroneously enumerated in the block cluster. The enumeration status for the E-sample person was unresolved. The UE code was also used when the person did not live at the sample address on Census Day and the Census Day address was not complete enough to determine if the Census Day address was in the sample cluster. This code was also used when the E-sample person was followed up to collect geographic information and that information was not collected.
MU = The E-sample person was matched, but the follow-up interview obtained no useful information to resolve the residence status for the matched person who had a residence status of unresolved before follow-up. The E-sample person's enumeration status was unresolved.
P = There was not enough information collected during the follow-up interview to determine if the possible match was a match or not. The match status of the P-sample person was unresolved.
GU = The geographic work for the targeted extended search was unresolved. The code had the same definition in both the before and after follow-up matching. The difference was that, in after follow-up matching, the code was used only in the list/enumerate clusters. The field work for the targeted extended search was not done or the block number on the form was not in the surrounding blocks, in the block cluster, or on the map. It was not clear where the housing unit was located.

Attachment 7. Final P-Sample Person Residence Status Codes

Resident
M = The P-sample and census people were matched.
MR = The follow-up interview determined that the matched person with unresolved residence status was a resident.
NR = The P-sample person was not matched and was identified as a resident in the block cluster on Census Day during the A.C.E. person follow-up interview. The P-sample person was missed in the census.
NP = The P-sample person was not matched to a census person. There was no follow-up for the whole household nonmatches from person interviews with household members and the whole household nonmatches were not conflicting household nonmatches. These people were considered residents of the housing unit on Census Day.
NC = The P-sample nonmatch was found on the census roster. This person in a partial nonmatch household was not matched to the census because only name was collected in the census for this person in a large household and the census person was not data-defined. No follow-up interview was necessary.
Nonresident
FP = The P-sample person was fictitious in this block cluster. The person was interviewed in error during the person interview. This person was not included in the final P sample.
NL = The P-sample person did not live at the sample address or in the block cluster on Census Day and was listed as a nonmover or outmover in error. This person was removed from the list of P-sample people, since he or she was collected during the person interview in error.
NN = The P-sample person was identified as a nonresident in the block cluster on Census Day during the A.C.E. person follow-up interview, because the person lived in group quarters on Census Day or had another residence where the person should have been counted on Census Day according to census residence rules. This person was removed from the list of P-sample people, since he or she was collected during the person interview in error.
DP = The P-sample person was a duplicate of another P-sample person.
MN = The A.C.E. person follow-up interview determined that the matched person with unresolved residence status was not a resident in this housing unit or in this block cluster. The person was no longer in the list of P-sample people.
GP = The P-sample person was removed because the person interview was conducted at a housing unit that exists outside the sample cluster. The person follow-up identified this housing unit as a P-sample geocoding error.

Unresolved
MU = The A.C.E. person was matched, but the follow-up interview obtained no useful information to resolve the residence status for the matched person who had a residence status of unresolved before follow-up. The P-sample person's residence status was unresolved.
NU = The P-sample person was not matched. Not enough information was collected during the A.C.E. person follow-up interview to identify the P-sample person as a resident or nonresident in the block cluster. The residence status for the P-sample person was unresolved. This code was also used when the P-sample person was followed up to collect geographic information and that information was not collected. The NU code was also used when the person did not live at the sample address on Census Day and the Census Day address was not complete enough to determine if the Census Day address was in the sample cluster.
P = There was not enough information collected during the follow-up interview to determine if the possible match was a match or not. The residence status of the P-sample person was unresolved.
KI = Match not attempted for the P-sample person, because the person had insufficient information for matching and follow-up. The name was blank or incomplete or the name was complete, but the person had only one characteristic. This was a computer assigned code and these people were suppressed from view by the matchers.
KP = Match not attempted for the P-sample person, because (1) the name was incomplete, such as "Mr. Jones," or (2) the name was not a valid name, such as "White Female" or "Donald Duck." This was a clerically assigned code.

FORM D-1302 (6-23-99) Section 4 - LISTING PAGE
Hello, I'm (Your name) from the U.S. Bureau of the Census. Here's my identification. We are listing addresses as part of the Census 2000, and I have a few questions to ask you.
Figure 4-1. Address Listing Book Page for Single and Multiunit Structures
[Form D-1302 listing page, items (1) through (17): line, block, and map spot numbers; house number, road/street name, rural route and box number, PO box, and ZIP Code; householder name; physical location description or E-911 address; type of address and unit status; remarks; counts of housing units at single-unit and multi-unit basic addresses; total number of housing units; and source of the information (household member, proxy, manager, or observation).]
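The final P-sample codes listed in Attachments 5 and 7 use the same sixteen codes, grouped once by match status and once by residence status. As a reading aid, the two groupings can be written as lookup tables. The Python sketch below is an illustration only: the dictionary and function names are hypothetical and do not come from A.C.E. processing software, while the code groupings are copied from the attachments.

```python
# Attachment 5 grouping: final P-sample person match codes.
MATCH_GROUPS = {
    "matched": {"M", "MR", "MU"},
    "not matched": {"NP", "NC", "NR", "NU"},
    "unresolved": {"P", "KI", "KP"},
    "removed": {"FP", "NL", "NN", "DP", "MN", "GP"},
}

# Attachment 7 grouping: final P-sample person residence status codes.
RESIDENCE_GROUPS = {
    "resident": {"M", "MR", "NR", "NP", "NC"},
    "nonresident": {"FP", "NL", "NN", "DP", "MN", "GP"},
    "unresolved": {"MU", "NU", "P", "KI", "KP"},
}


def _lookup(code: str, groups: dict) -> str:
    """Return the name of the group containing the code."""
    normalized = code.strip().upper()
    for group, codes in groups.items():
        if normalized in codes:
            return group
    raise KeyError(f"unknown P-sample code: {code!r}")


def match_status(code: str) -> str:
    """Match status group of a final P-sample code (Attachment 5)."""
    return _lookup(code, MATCH_GROUPS)


def residence_status(code: str) -> str:
    """Residence status group of a final P-sample code (Attachment 7)."""
    return _lookup(code, RESIDENCE_GROUPS)


if __name__ == "__main__":
    for code in ("M", "MU", "NR", "KI", "GP"):
        print(code, match_status(code), residence_status(code))
```

The tables make one point of the attachments easy to see: a code such as MU is matched in Attachment 5 but has unresolved residence status in Attachment 7, and every code removed from the P sample is a nonresident code.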
Chapter 5. Targeted Extended Search

INTRODUCTION
The concept behind the dual system estimate is to estimate the census omission rate using the P sample and the erroneous enumeration rate using the E sample. The complete definition of being omitted from or erroneously enumerated in the census includes the concept of location; that is, a successful enumeration must have located the person in the right place. "Right location" in the census means anywhere in the block where the reported housing unit address was located, or in the search area, defined as one ring of adjacent blocks. The operation concerned with locating and matching the persons in the surrounding areas is Targeted Extended Search, or TES. The name was chosen because, unlike the similar procedure in the 1990 Post-Enumeration Survey (PES) where the surrounding area of every cluster was searched, the A.C.E. search was targeted in two ways:

1. Results from the initial housing unit matching operation were used to select the housing units that were candidates for TES.
2. In most cases, only clusters that included TES-eligible housing units were included in TES.

This chapter focuses on the statistical methods used in TES. A.C.E. field and processing activities, including TES, are described in Chapter 4.
Overview
The 1990 PES included a search in all blocks surrounding each sample cluster. Every person in every house in every block adjoining every sample block cluster was included in the search. This was determined to be burdensome in terms of time, cost, and perhaps mental fatigue on the part of matchers performing low-payoff searches (Hogan, 1993). To improve efficiency, the Census 2000 A.C.E. took a more focused (i.e., targeted) approach in selecting clusters, defining search areas, and determining which housing units and residents would be part of surrounding block operations. The Census 2000 A.C.E. search operation differed from the 1990 PES in four primary ways:

1. Search area definition.
2. Amount of searching.
3. Persons eligible for search.
4. TES weighting.

Search Area Definition
The search area for the 2000 A.C.E. was limited to either just the sample block cluster or one ring of adjacent blocks. An adjacent block is one that touches the cluster of sample blocks at one or more points. This definition includes the blocks that touch the corner of the block cluster. Results from empirical research, using Census 1998 Dress Rehearsal data, show that the additional benefits of using two rings of surrounding blocks are negligible (Wolfgang, 1999).

Amount of Searching
There were two important differences between the extent of searching in the 1990 PES and the 2000 A.C.E.:

1. Only about 20 percent of A.C.E. block clusters had their surrounding areas searched, whereas in 1990 the surrounding area of every block cluster was searched.
2. The search was targeted (in most cases) to only housing units identified as being likely to exhibit geocoding error; in 1990, all persons in surrounding areas were eligible for search.

The clusters with a high number of potential geocoding errors were identified from the results of the initial housing unit matching operation and subsequent field follow-up (see Chapter 4). These were A.C.E. block clusters with a large number of Independent Listing housing units not found in the January 2000 Census Address List. These types of nonmatches are possibly census geocoding errors of exclusion (i.e., not included in the census within
thesampleareaalthoughtheyshouldhavebeen).Onthecensusside,A.C.E.blockclusterswithalargenumberofcensusgeocodingerrorsarelikelytobeerrorsofinclusion (i.e.reportedbythecensusintheblockcluster,although theunitisphysicallyoutsidetheA.C.E.blockcluster).ThesetwotypesofhousingunitswereeligibletobeintheextendedsearchaspartofTESoperations,andarethus TES-eligiblehousingunits.Anyclusterthatincludedatleastonepotentialcensusgeocodingerror,ofeitherinclusionorexclusion,waseli-gibletohaveTESoperationsperformedinitandistermed aTES-eligiblecluster.Clusterswithnosuchpotentialgeocodingerrorsbecame non-TES-eligible.TheclustersinwhichTESwasactuallydoneareTESclusters, andwereselectedfromamongtheTES-eligibleclusterseitherwithcertaintyorbyprobabilitysampling.SectionIChapter55-1TargetedExtendedSearchU.S.CensusBureau,Census2000 Resultsfromthe1990PESshowthatgeocodingerrorsarehighlyclustered.Slightlyover77percentofthewhole householdnonmatcheswereconcentratedinlessthan one-fourthofthePESsampleblockclusters.Ontheother hand,about72percentofthecensusgeocodingerrors werefoundinlessthan3percentofthePESsampleblock clusters.TESisagoodexampleofDemings80-20 guideline80percentofthebenefitsarerealizedbysolv-ing20percentoftheproblems.PersonsEligibleforSearchInordertobeincludedinTESoperations,apersonmustlivein:*aTEScluster;and*aTES-eligiblehousingunitApersoninahousingunitthatwasnotaTES-eligiblehousingunit,wasanon-TESperson,andthus,wasnot directlyaffectedbyTESoperations.AnypersoninaTES-eligiblehousingunitwasaTESperson,unlesssomeoneinthehousingunitmatched(i.e.someoneis confirmedtobenotaTESperson).TESpersonsinclusters thatwerenotselectedforTESoperationswereidentified,butdidnothaveTESoperationsapplied.Instead,thesecaseswereeffectivelyremovedfromthesamplebyhaving anassignedweightofzero.TheywererepresentedbypersonsinotherTESclustersselectedbysampling.TESWeightingEveryselectedTESclusterwasassignedasamplingweightequaltothereciprocalofitsselectionprobability.ThisTESclusterweightwasassignedtoallTESpersonsinthat 
clusterandwasmultipliedbytheirA.C.E.samplingweightstoproducetheirTES-adjustedweights.TheTES-adjustedweightforTESpersonsinclustersnotselected forTESiszero.Inthisway,theTESpersonsintheTESclustersrepresenttheTESpersonsinnon-TESclusters.Allelementsofthedualsystemestimate(DSE)calculation, exceptthoseinvolvinginmovers,canbeaffectedbythe TESweightingbecauseTESpersonscanbenonmoversoroutmovers,matchesornonmatches,andcorrectorerroneousenumerations.CLUSTERSAMPLINGThedecisiontoselect20percentoftheA.C.E.blockclus-tersforTESwasbasedontheassumptionthatmostoftheTES-eligiblehousingunitsandpersonswouldbeconcen-tratedinasmallfractionoftheblockclusters.Hence, mostofthebenefitsofacompletesurroundingarea searchcouldberealizedatasubstantialreductionincost,ifadisproportionateshareoftheeffortwasconcentratedintheclusterswiththegreatestlikelypayofftheones withthemostTES-eligiblehousingunitsandpersons.Targetingtheseclusterswouldachieveoneoftheprinci-palgoalsofthesurroundingareassearchvariancereduction.However,itisofatleastequalimportancethatthesurroundingareasearchbebalanced.Therearetwo waysTEScouldhavebeenoutofbalance:1)thegeo-graphicalareaincludedinthesearchcouldhavediffered betweenthePandEsamples;2)theTESblockcluster samplingcouldhaveselectedclusterscontainingerrorsof inclusionwithgreaterorlesslikelihoodthanclusterswith errorsofexclusion.Toachievethebalancinginsample selection,itwasnecessaryforeachclusterwith TES-eligiblehousingunitsandpersonstohavesome probabilityofbeingselectedforTESandbeweightedby theinverseoftheselectionprobability.TheinformationavailableforTESselectionincludedtheresultsoftheinitialhousingunitmatching,whichincludedtheresultsfromthehousingunitfollow-up.
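The TES weighting rule described above can be illustrated with a minimal Python sketch. The function names and the example weights (an A.C.E. weight of 300 and a selection probability of 0.25) are hypothetical and chosen only for illustration; they are not taken from the A.C.E. production system.

```python
def tes_cluster_weight(selected, selection_probability):
    """TES cluster weight: reciprocal of the selection probability, or 0 if not selected."""
    if not selected:
        return 0.0
    return 1.0 / selection_probability


def tes_adjusted_weight(ace_weight, is_tes_person, cluster_weight):
    """Multiply the A.C.E. weight by the TES cluster weight for TES persons only."""
    if not is_tes_person:
        return ace_weight  # non-TES persons keep their A.C.E. weight
    return ace_weight * cluster_weight


# Hypothetical person with an A.C.E. sampling weight of 300.
sampled = tes_cluster_weight(True, 0.25)       # cluster selected with probability 0.25
unselected = tes_cluster_weight(False, 0.25)   # TES-eligible cluster that was not selected
print(tes_adjusted_weight(300.0, True, sampled))      # 1200.0
print(tes_adjusted_weight(300.0, True, unselected))   # 0.0
print(tes_adjusted_weight(300.0, False, unselected))  # 300.0
```

A cluster selected with certainty has selection probability 1, so its TES persons simply keep their A.C.E. weights.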
Housing unit follow-up indicated, among other things, the count of potential geocoding errors of inclusion and exclusion. The geocoding errors of inclusion were census units found outside the cluster. Potential geocoding errors of exclusion were coded as address nonmatches in the independent listing. The combined number of census geocoding errors and independent listing address nonmatches was considered to be the number of potential geocoding errors in each cluster. For most clusters, the probability of being selected for TES depended on the count of potential geocoding errors. The exceptions are relisted clusters and clusters that were enumerated in the census using the List/Enumerate methodology. Those clusters did not go through housing unit matching and follow-up.

A housing unit that represented a potential geocoding error could have been discovered by TES operations to be a geocoding error or an actual coverage error. Putting a particular housing unit in the "potential" category made it, and the persons living in it, eligible for TES. This search was intended to determine whether the housing unit and persons were geocoded incorrectly into a neighboring block, in which case they would be counted as correctly enumerated, or were truly enumeration errors.

Hence, the following TES selection strategy was implemented:

* Clusters that did not have counts of potential geocoding errors available at the time of the TES sampling operation were assigned to a separate TES procedure. Clusters that were relisted (which were later included in TES with certainty) or enumerated using the List/Enumerate methodology (which were ultimately excluded from TES) fall into this group.
* The 5 percent of clusters that included the largest number of housing units that were potential geocoding errors were included in TES with certainty.
* The 5 percent of clusters that had the most housing units that were potential geocoding errors, when weighted by their A.C.E. cluster weight, were also included in TES with certainty. The 5 percent of clusters included in the above bullet for having the most unweighted cases were excluded before this step was performed, so that a total of 10 percent of the A.C.E. clusters were selected based on the two certainty criteria.
* All remaining clusters with at least one potential geocoding error housing unit were assigned to a noncertainty stratum to be sampled at a uniform national rate to be included in TES. The sampling rate was set so that the overall size of the TES sample, including those selected by certainty and by sampling, totaled 20 percent of A.C.E. clusters (excluding the first group).

Clusters with no potential geocoding errors were excluded from TES selection since there were no housing units or persons that were candidates for TES operations. This creates a potential for a small bias in TES, because housing units added to or deleted from the address lists after the selection of TES clusters were not eligible for TES operations.

Sampling Methodology

For the United States as a whole, there were 11,303 A.C.E. clusters. Of these, 420 were excluded from TES selection because they used the List/Enumerate census method. Of the remaining 10,883 clusters, 20 percent, or 2,177, were selected for TES. Of the eligible clusters, 62 were relist clusters and were not part of the normal TES selection. (These clusters did not count as part of the 2,177 TES target sample size.) Five percent of the sampling universe, or 544 clusters, with the most potential geocoding errors were selected for TES with certainty and assigned a TES weight of 1. Of the remaining clusters, an additional 544 with the most potential geocoding errors, when weighted by the A.C.E. cluster weight, were also selected with certainty and assigned a TES weight of 1.

Of the remaining clusters that included at least one potential geocoding error, 1,089 were selected using systematic random sampling with equal probability. There were 5,326 clusters in the noncertainty stratum (i.e., all those that were not already selected by one of the other means and that contained at least one potential geocoding error), so the selected clusters were assigned a TES weight of 5,326 divided by 1,089, or 4.8907. The remaining 4,407 clusters were out of scope for TES because they had no identified potential geocoding errors.

For purposes of drawing the systematic sample, clusters were sorted in the order:

* State
* First-phase Sampling Stratum
* Second-phase Sampling Stratum
* Small Block Cluster Sampling Stratum
* Cluster Number

The first four characteristics are the same ones used to select the A.C.E. sample. Sorting clusters in this order for TES improved the representativeness of TES with respect to the national A.C.E. sample. After sorting in this order, the clusters were systematically sampled with equal probability using a take-every of 4.8907 and were assigned a TES weight equal to that figure.

Results of Cluster Sampling

The TES sample included 2,239 block clusters out of 11,303, or 19.8 percent. (Originally it had been intended to include a small number of List/Enumerate clusters in TES, and some sample was set aside for them but never used.) The TES clusters included about 36,000 of the 45,000 TES-eligible E-sample housing units and about 57,000 of the 77,000 TES-eligible P-sample housing units, representing 80 and 73 percent of TES-eligible units in their respective samples before subsampling within large block clusters was performed.
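The selection arithmetic and the systematic sampling step from the Sampling Methodology discussion can be reproduced with a short Python sketch. This is an illustration under stated assumptions, not the Census Bureau's production code: the record field names in `sort_for_sampling` are hypothetical, and the handling of the fractional take-every (truncating each running position to an index) is one common convention that the source does not spell out.

```python
import random


def tes_sample_sizes(universe, certainty_share=0.05, overall_share=0.20):
    """Return (overall target, size of each certainty group, noncertainty sample size)."""
    target = round(universe * overall_share)
    per_certainty_group = round(universe * certainty_share)
    return target, per_certainty_group, target - 2 * per_certainty_group


def sort_for_sampling(clusters):
    """Sort cluster records by the five keys listed in the text (field names are hypothetical)."""
    return sorted(
        clusters,
        key=lambda c: (c["state"], c["phase1_stratum"], c["phase2_stratum"],
                       c["small_block_stratum"], c["cluster_number"]),
    )


def systematic_sample(sorted_units, take_every, start=None):
    """Select every take_every-th unit from a random start in [0, take_every)."""
    if start is None:
        start = random.random() * take_every
    picks = []
    k = 0
    while start + k * take_every < len(sorted_units):
        picks.append(sorted_units[int(start + k * take_every)])
        k += 1
    return picks


# Figures from the text: 10,883 clusters in the universe, 5,326 in the noncertainty stratum.
target, per_group, noncertainty_n = tes_sample_sizes(10883)
take_every = 5326 / noncertainty_n
sample = systematic_sample(list(range(5326)), take_every, start=2.0)
print(target, per_group, noncertainty_n, round(take_every, 4), len(sample))
# 2177 544 1089 4.8907 1089
```

Because every noncertainty cluster has the same selection probability of 1,089 / 5,326, the take-every of 4.8907 is also the TES weight assigned to each sampled cluster.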
Because of differences in procedures, more E-sample units got into TES by certainty (76 percent versus 66 percent), while more P-sample units were selected by sampling, 7 percent to 5 percent. TES units represent about 7 percent of the housing units resulting from initial housing unit matching. (See Table 5-1.) This was not the final number of housing units included in TES field operations because:

* Subsampling within large block clusters reduced the number of A.C.E. housing units in clusters with 80 or more housing units; and
* Housing unit counts for Relist clusters were not available at the time the sample was selected.

Subsampling within large block clusters reduced the final TES workload to 12,000 E-sample and 18,000 P-sample housing units.

Table 5-1. TES Sampling Frame and Selection Results

                                     Total potential errors   Errors of inclusion   Errors of exclusion
 Category                 Clusters   Number     Percent       Number    Percent     Number    Percent
 Total                    11,303     122,440    100           45,053    100         77,387    100
 Out-of-scope             4,827      0          ...           0         ...         0         ...
   List/Enumerate         420        0          ...           0         ...         0         ...
   No TES HUs             4,407      0          ...           0         ...         0         ...
 Eligible for TES         6,476      122,440    100           45,053    100         77,387    100
 Certainty                1,150      85,309     70            34,089    76          51,220    66
   Top weighted           544        11,858     10            4,037     9           7,821     10
   Top unweighted         544        73,451     60            30,052    67          43,399    56
   Relist                 62         0*         ...           0*        ...         0*        ...
 Noncertainty             5,326      37,131     30            10,964    24          26,167    34
   Selected into sample   1,089      7,642      6             2,106     5           5,536     7
   Not selected           4,237      29,489     24            8,858     20          20,631    27
 TES clusters             2,239      92,951     76            36,195    80          56,756    73

*TES units in Relist clusters had not been determined at the time the sample was selected.
Note: Percentages in table may not add to total due to rounding.

TES FIELD AND PROCESSING ACTIVITIES

Details on the operations involved in TES are described in Chapter 4. In summary, the main activities are:
- Cluster selection (Spring 2000). This operation selects the clusters for TES. Because of the need to select the cluster sample at a particular time, the final E and P samples had not been selected at the time of this operation.
- Search for census units in surrounding blocks (Summer 2000). Determines if census units erroneously included in the sample block cluster are located within the surrounding ring of blocks. This field operation is described more fully in Chapter 4.
- Identify TES Persons (Fall 2000). An automated activity performed at the National Processing Center in Jeffersonville, Indiana. See Chapter 4 for more information.
- Extend the search area to surrounding blocks for TES persons (Fall 2000). The P-sample TES persons were allowed to match to census records in the surrounding block. The E-sample TES persons were treated as correct enumerations if the census unit was located in a surrounding block. This was a clerical operation.
- Assign TES weights (Winter 2000/2001). TES persons identified in TES-eligible clusters were assigned the TES weight associated with that cluster, either 1.0 for a cluster selected with certainty or 4.8907 for a cluster selected by sampling. TES persons in TES-eligible clusters not selected into the sample were assigned a zero weight.

Adds and Deletes

The preliminary census address list of housing units as of January 2000 was the source for the initial housing unit matching on which TES is based. Since some housing units on the January 2000 list were later deleted and others added, the final list of census housing units did not exactly match the initial housing unit matching counts of potential geocode errors. Therefore, procedures were necessary to update the TES identifications for adds and deletes.

In the vast majority of cases, where adds and deletes were not involved, P-sample housing units were TES-eligible if they did not match to a census address. However, if a P-sample unit was matched to an address during initial housing unit matching, but that address was deleted, then the unit was considered nonmatched. To adjust for deletions, P-sample persons in housing units that were matched to deleted census housing units were flagged as TES persons, as long as the unit did not contain any persons matched within the sample block (i.e., non-TES persons). This adjustment was performed only on persons in TES clusters.

E-sample housing units that were added to the final census list after January 2000 could represent geocoding errors, but they were not part of TES field operations.
Without field operations, persons in such units would never be identified as surrounding block correct enumerations. Therefore, a correct enumeration probability was imputed for such persons in TES clusters. The imputed probability is the overall correct enumeration probability of all resolved persons in geocoding error housing units in the TES sample. See Chapter 6 for a description of the procedure.

Table 5-2. Effect of Census Address List Changes after January 2000

                                                            Count   Weighted    Matches/correct enumerations
 P sample - persons in housing units matched to deletes     2,319   2,036,564   675,892
 E sample - geocode error adds                              53      15,307      14,915

TES IN DUAL SYSTEM ESTIMATION

Accounting for TES in the DSE calculation is primarily a matter of applying weights properly. Every person in the A.C.E. is either a TES person or a non-TES person, and every A.C.E. cluster is either a TES cluster or a non-TES cluster. Every TES person is assigned the TES weight of his A.C.E. cluster. The calculation of the DSE requires the use of seven distinct components, all but one of which represent the sum of the A.C.E. weights for some group of persons in the A.C.E., including both TES and non-TES persons. Hence, six of the seven components represent a weighted sum of TES and non-TES persons, the former with their TES cluster weights applied.

Applying TES Weights

Every A.C.E. cluster including TES persons has a TES weight, although that weight is zero if the cluster is not selected for TES. A TES person must be weighted by the associated TES weight. The A.C.E. weight is multiplied by the TES weight to produce a person weight. TES weighting does not affect the weight of non-TES persons. Their individual weights are the same as the A.C.E. weights.

Table 5-3. TES Weights by TES Status of the Person and Cluster

                     TES cluster                                        Non-TES cluster
 TES persons         1, if cluster in TES with certainty                0
                     4.8907, if cluster selected for TES by sampling    0
 Non-TES persons     1                                                  1

The issues related to inmovers, outmovers, and noninterviews are the same for TES persons as for all other persons. From a calculation standpoint, the only effect that TES status has on the dual system estimates is in applying the cluster's TES weight.

DSE Calculation

The DSE for Census 2000 is:
$$ DSE = DD \times \frac{CE}{N_e} \times \frac{N_n + N_i}{M_n + \left(\frac{M_o}{N_o}\right) N_i} $$

where

 DD = census data-defined persons
 CE = estimated number of A.C.E. E-sample correct enumerations
 N_e = number of A.C.E. E-sample persons
 N_n = estimated number of A.C.E. P-sample nonmovers
 N_i = estimated number of A.C.E. P-sample inmovers
 N_o = estimated number of A.C.E. P-sample outmovers
 M_n = estimated number of A.C.E. P-sample nonmover matches
 M_o = estimated number of A.C.E. P-sample outmover matches

The estimator has seven distinct A.C.E. components (plus DD from the census enumeration). Six of the seven components represent a weighted sum of persons, including both TES and non-TES persons. Other than inmovers, who cannot be TES persons, each of the DSE components is expressed as:
$$ \sum_{i=1}^{n} \sum_{j=1}^{n_p} w^{*}_{ij} m_{ij} x_{ij} + \sum_{i=1}^{n} \sum_{j=1}^{n_p} w^{*}_{ij} m_{ij} y_{ij} + \sum_{i=1}^{n} \sum_{j=1}^{n_p} w^{*}_{ij} t_{ij} m_{ij} z_{ij} $$

where

 i = cluster index
 j = person index
 n = number of block clusters in the A.C.E. sample
 n_p = number of persons in block cluster i
 x_ij = 1 if the person is not a TES person, 0 otherwise
 y_ij = 1 if the person is a TES person and is in the TES sample with certainty, 0 otherwise
 z_ij = 1 if the person is a TES person and is in the TES systematic sample, 0 otherwise
 m_ij = characteristic of interest: match, correct enumeration, E-sample person, or P-sample person
 w*_ij = weight used for estimation (includes inverse of the probability of selection for A.C.E., adjustment for household noninterview, and weight trimming)
 t_ij = TES sampling weight, the TES systematic sample take-every

EFFECTS OF TES ON DUAL SYSTEM ESTIMATION

The principal effect of TES in Census 2000 is approximately what was expected: the overall correct enumeration rate was 2.9 percentage points higher with TES than it would have been without, and the overall match rate was 3.8 percentage points higher (see Table 5-4). The larger increase in the match rate, as compared to the correct enumeration rate, occurred because there were more identified potential geocoding errors in the P sample than in the E sample.

Table 5-4. Effect of TES at the National Level

                                        With TES      Without TES   Difference*   Effect of TES
                                        (1)           (2)           (1)-(2)       (1)/(2)
 E sample
   Persons (N_e)                        264,578,862   264,634,794   (55,932)**    1.000
   Correct Enumerations (CE)            252,096,238   244,387,951   7,708,288     1.032
   CE Rate (%)                          95.3          92.3          2.9           1.032
 P sample
   Persons (N_p)                        263,037,259   262,906,916   130,343**     1.000
   Matches                              240,878,622   230,681,205   10,197,418    1.044
   Match Rate (%)                       91.6          87.7          3.8           1.044
 Ratio of CE to Match Rate              1.040         1.053         (0.012)       0.989
 Coefficient of Variation for Ratio     0.129         0.314         (0.197)       0.405

*Percentages were calculated on unrounded values.
**The weighted E- and P-sample sizes differed slightly because of variance in TES sampling.
Note: Table above reflects national totals without regard to post-stratification and differs from other totals in which post-stratum totals were aggregated.

The difference in the number of matches versus correct enumerations (10.2 million and 7.7 million, respectively) from TES had been a source of concern, since it suggested the possibility of balancing error. Balancing error would have occurred if the geographic boundaries included in the P and E samples had not been consistent. For instance, balancing error would arise if the P sample were allowed to match to census persons in housing units beyond the first ring, while a census unit could only be classified correct if it was within the first ring. Adams and Liu (2001) performed an evaluation study of the P-sample housing units in A.C.E. and concluded that the main source of the measured imbalance was geocoding error in the P sample.

Table 5-4 shows that TES increased the number of correct enumerations from 244.4 million to 252.1 million and matches from 230.7 million to 240.9 million. Before TES, there had been 20.2 million erroneous enumerations, of which 7.7 million were geocoding errors that were classified as correct enumerations by TES. TES also allowed 10.2 million additional P-sample matches to occur out of 32.2 million original nonmatches. Improving both the match and correct enumeration rates this much significantly reduces the variance of the DSE, since over 90 percent of people match or are correctly enumerated.

Table 5-5 shows the significant contribution that TES makes to variance reduction. For the A.C.E. considered as a whole (i.e., a direct DSE of the entire population without post-stratification), the coefficient of variation is 0.129 percent with TES and 0.314 percent without TES.

Table 5-5. Effect of TES on Coefficient of Variation (CV)
                Standard error   CV (percent)
 With TES       355,451          0.129
 Without TES    877,664          0.314

Note: Table above reflects national totals without regard to post-stratification and differs from other totals in which post-stratum totals were aggregated.

At the post-stratum level, the average improvement in the DSE standard error is about 33 percent. The gains in precision as measured by variance show that TES makes dual system estimates more precise, and that TES improves the quality of the A.C.E., so long as it does not make the DSEs less accurate by introducing bias. The coefficient of variation was reduced for a majority of the collapsed post-strata (448 original post-strata were collapsed into 416 post-strata for DSE calculation purposes).

Table 5-6. Effect of TES on Post-Stratum CVs [Percent]

                                         With TES   Without TES
 Average CV                              2.07       2.66
 Median CV                               1.81       2.32
 Average CV weighted by census count     1.30       1.93

Chapter 6. Missing Data Procedures

INTRODUCTION

This chapter gives an overview of missing data procedures for the Census 2000 Accuracy and Coverage Evaluation (A.C.E.). General background information is presented first, while the following sections describe three types of procedures used to account for data missing in the A.C.E. The noninterview adjustment accounts for whole-household nonresponse. The next section describes the characteristic imputation used to assign values for specific missing demographic variables. Finally, for persons with unresolved match, residence, or enumeration status, a probability of matching, residence, or correct enumeration was assigned according to procedures.

As missing data in the A.C.E. were addressed after the completion of the field operations that produced the A.C.E. data files, a knowledge of the field activities and the circumstances that led to specific outcomes is necessary to understand the motivation for these procedures. For this information, the reader is referred to Chapter 4 for details on the field operations.

The missing data procedures used in the Census 2000 A.C.E. were similar to those used on the Integrated Coverage Measurement (ICM) sample in the Census 2000 Dress Rehearsal. An outline of the ICM procedures and a summary of related research are given in Ikeda, Kearney, and Petroni (1998). Kearney and Ikeda (1999) provide an overview of the results from the Dress Rehearsal. For detailed missing data procedures for the 2000 A.C.E., see Cantwell (2000) and Ikeda and McGrath (2001). A few basic results on missing data from 2000 are found in this chapter; many more results can be found in Cantwell et al. (2001).
BACKGROUND

Before dual system estimates were calculated, it was necessary to account for missing information from the interviews of P-sample people and from the matching operations. It should be noted that the term missing data applies after all follow-up attempts have been made. Chapter 4 describes some of the extensive field procedures conducted to minimize the resulting level of such missing data. These activities, all specified in advance, included multiple attempts at interviews, the use of highly trained clerks and technicians to resolve cases, and the followup of cases where a second interview could provide additional required information.

There were two main types of missing data in the A.C.E. and three processes used to correct for them. The first type was unit missing data. These were households that were not interviewed in the A.C.E. either because they could not be contacted or because the interview was refused. The noninterview adjustment process spread the weights of these households among households that were interviewed in the same noninterview cell.

The other type of missing data was item missing data. This situation occurred when some information for a household or person was available but portions of the data were missing. Two groups of missing data items had to be addressed: demographic items and items relating to a specific operational status. Missing age, sex, tenure, race, and Hispanic origin were imputed to allow the production of estimates of the census undercount by these characteristics, and because they were necessary to assign people to post-strata.

For a small number of people in the P sample, there was not enough information available to determine the match status (whether or not the person matched to someone in the census in the appropriate search area) or the residence status (whether or not the person was living in the block cluster on Census Day). Determining residence status was important for the P sample because Census Day residents of the block clusters in the sample were used to estimate the proportion of the population who were not counted in the census. Similarly, some people in the E sample lacked information to determine whether the person was correctly enumerated. Such cases where status could not be determined were said to be unresolved. Generally, for cases with missing status, a probability of residence, match, or correct enumeration was assigned based on information available about the specific case and about cases with similar characteristics.

In the 1990 Post-Enumeration Survey, a hierarchical logistic regression program was used to calculate probabilities of match and correct enumeration for cases with missing information. (Due to the procedure used to treat movers in 1990, residence status played a different role then.) The model and some results are discussed in Belin et al. (1993). During census tests in 1995 and 1996, certain components of missing data were addressed using logistic regression, while for other components a simpler procedure called imputation cell estimation was used. The latter procedure was used exclusively in the Census 2000 Dress Rehearsal in 1998. Data from these tests indicate that the exact method of calculating probabilities for unresolved status (match, residence, or correct enumeration) has a minor effect on the dual system estimates. More details of this research can be found in Ikeda (1997, 1998, 1998b, and 1998c) and Cantwell (1999). Based on these findings and concerns about implementing logistic regression in a production environment, the simpler procedure (that is, imputation cell estimation) was used to estimate missing data items in the A.C.E.

Noninterview Adjustment (Household Level)

At the time of the Computer Assisted Personal Interview (CAPI), questions were asked to determine who lived in the household on Interview Day and who lived there on Census Day, and a mover status was assigned based on the replies. Thus two rosters were created for each household: the Census Day roster and the Interview Day roster. The A.C.E. used inmovers to estimate the number of P-sample movers in the post-stratum, while using outmovers to estimate the match rate of the movers. This method is referred to as Mover Procedure C or PES-C in the research studies. See Chapter 4 for descriptions of the terms nonmover, inmover, and outmover.

All inmovers and all nonmovers were generally assumed to be A.C.E. Interview Day residents, with the exception of infants born after Census Day. People living in group quarters, such as college students in dormitories, were not eligible for the P sample. Therefore, for the purpose of estimating the number of inmovers, person inmovers aged 18 to 22 who were living in group quarters on Census Day were not considered to be Interview Day residents.

Noninterview adjustment was performed only on the P sample. The procedure was similar to that used in the Census Dress Rehearsal. Due to the mover procedure described above, there were two noninterview adjustments: one based on housing unit status as of Census Day (i.e., the Census Day roster), and the other based on 
housing unit status as of the day of the A.C.E. interview (i.e., the Interview Day roster). An occupied housing unit was defined as an interview (for the given reference day, Census Day or Interview Day) if there was at least one person (with a name and at least two demographic characteristics) who possibly or definitely was a resident of the housing unit on the given reference day. An occupied housing unit (as of the given reference day) that was not an interview was a noninterview. Thus a unit that was vacant, removed from the list of eligible housing units (because, for example, it was demolished or used only as a business), or in certain special places was not considered an interview or a noninterview. In the latter two situations, the unit was deleted from the list of A.C.E. sample housing units.

If a housing unit was found to be vacant on Census Day or deleted from the sample, then that household did not factor into the Census Day noninterview adjustment. The same concept applies to Interview Day. Thus, vacant and deleted units did not contribute toward dual system estimation. An example of an illustrative block cluster, provided in Figure 6-1, page 6-10, shows how the status of a housing unit on Census Day and Interview Day would be determined. Results of the A.C.E. interviewing operation are shown in Table 6-1.

Table 6-1. Status of Household Interviews in the A.C.E. [Unweighted]

                          Census Day             A.C.E. Interview Day
                          Number     Percent     Number     Percent
 Total housing units      300,913    100.0       300,913    100.0
   Interviews             254,175    84.5        264,103    87.8
   Noninterviews          7,794      2.6         3,052      1.0
   Vacant units           28,472     9.5         29,662     9.9
   Deleted units          10,472     3.5         4,096      1.4
 Noninterview rate                   3.0%                   1.1%

Note: Percentages in table may not add to total due to rounding.

Of the 261,969 housing units occupied on Census Day, 7,794 (3.0 percent) were noninterviews. The corresponding numbers for Interview Day were 267,155 and 3,052 (1.1 percent). The noninterview rate was higher for Census Day than Interview Day, because interview status was determined by results obtained on Interview Day. On that date, information was sought for both Census Day and Interview Day. Anytime a household member or knowledgeable proxy could be reached, an interview for Interview Day was generally obtained. Census Day data were not always obtainable from the same respondent, usually in cases when the housing unit's occupants had moved in after Census Day.

Each of the two noninterview adjustments generally spread the weights of noninterviewed units over interviewed units in the same noninterview cell, defined as the sample block cluster crossed with the type of basic address. For purposes of this adjustment, the types of basic address were single-family, multiunit (such as apartments), and all others. The Census Day noninterview adjustment, determined according to the status of housing units as of Census Day, was used to adjust the person weights of nonmovers and outmovers. Similarly, the Interview Day noninterview adjustment, determined according to the status of housing units as of Interview Day, was used to adjust the person weights of inmovers. The formulae are described as follows:

For a given block cluster and type of basic address, the Census Day noninterview adjustment factor was computed as

$$ f^{*}_{c} = \frac{\sum_{ \text{Census Day interviews} } w_i + \sum_{ \text{Census Day noninterviews} } w_i}{\sum_{ \text{Census Day interviews} } w_i} $$

where w_i represents the weight of housing unit i, that is, the inverse of its probability of selection into the A.C.E. sample. When computing the noninterview adjustment factor, the weight w_i incorporated the trimming that occurred in some block clusters. (See Appendix C.) However, the weights did not reflect the sampling for targeted extended search (TES, Chapter 5) for two reasons. First, the noninterview adjustment was done at the housing unit level, but a housing unit could contain some people with TES status and others without it. Second, TES status was not determined until after the matching operation, but information was usually not collected about people in noninterviewed units, and these people were generally not sent to be matched. Therefore, there was not a reasonable way of systematically classifying noninterviews into those with and without TES status.

Similarly, for a given cluster and type of basic address, the Interview Day noninterview adjustment factor (where the subscript on f denotes Interview Day) was computed as

$$ f^{*}_{i} = \frac{\sum_{ \text{Interview Day interviews} } w_i + \sum_{ \text{Interview Day noninterviews} } w_i}{\sum_{ \text{Interview Day interviews} } w_i} $$

The example in Figure 6-1 on page 6-10 demonstrates the calculation of the noninterview adjustment. When the unweighted number of noninterviewed units in a given noninterview cell (sample block cluster by type of basic address category) was more than twice the unweighted number of interviewed units, then the weights of the noninterviewed units were spread over the interviewed units in a broader cell. This cell was formed by combining the sample block clusters in the same A.C.E. sampling stratum within the same type of basic address. Because the noninterview rates were so small, the noninterview adjustment factors were close to 1 for most housing units in the sample. For Census Day, the factors were smaller than 1.10 for more than 92 percent of the units; for Interview Day, the factors were less than 1.10 for over 98 percent of the units.

Characteristic (Item) Imputation (Person Level)

Production of A.C.E. undercount estimates required data on age, sex, tenure (owner versus nonowner), race, and Hispanic origin to classify respondents by these important demographic characteristics, so they had to be imputed whenever the data were not collected. Characteristic imputation was not carried out for other missing variables (with the exception of the items with unresolved status).
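To make the noninterview adjustment factor from the preceding subsection concrete, here is a minimal Python sketch of the calculation for a single noninterview cell, together with the rule for moving to a broader cell. The function names and the example weights are hypothetical; the sketch only mirrors the formula and the "more than twice" rule stated in the text.

```python
def noninterview_factor(interview_weights, noninterview_weights):
    """Ratio of all occupied-unit weight in the cell to the interviewed-unit weight."""
    interviewed_total = sum(interview_weights)
    return (interviewed_total + sum(noninterview_weights)) / interviewed_total


def needs_broader_cell(n_interviews, n_noninterviews):
    """True when unweighted noninterviews exceed twice the unweighted interviews.

    A cell with noninterviews but no interviews also returns True, which avoids
    dividing by zero in noninterview_factor for that cell.
    """
    return n_noninterviews > 2 * n_interviews


# Hypothetical cell: nine interviewed units and one noninterview, each with weight 100.
interviews = [100.0] * 9
noninterviews = [100.0]
if not needs_broader_cell(len(interviews), len(noninterviews)):
    factor = noninterview_factor(interviews, noninterviews)
    print(round(factor, 4))  # 1.1111
```

Multiplying each interviewed unit's weight by the factor preserves the cell's total weight: 9 units at 111.11 each account for the original 1,000.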
Several variables also used to assign post-strata, such as the location or return rate of the census tract, were the same for everyone in the block. The extent of the missing characteristics is portrayed in Table 6-2.

The imputation rates in the E sample for the five characteristics listed above ranged from 0.3 percent for sex up to 3.8 percent for tenure (using unweighted frequencies). Since the A.C.E. record for each person in the E sample was matched to the Census 2000 edited file and the five characteristics were extracted and copied, the following imputation procedures apply only to the P sample.

P-sample characteristic imputation for the A.C.E. was similar to that for the 1990 PES and the various Census 2000 tests, including the Dress Rehearsal. Age and sex were imputed based on the available demographic distributions determined from the P sample. Tenure was imputed using a form of nearest-neighbor hot-deck procedure. To impute for race and Hispanic origin, the two approaches were combined.

For missing tenure, race, and Hispanic origin, a hot-deck procedure was used to take advantage of the correlations often found in these characteristics among people living in the same block cluster (or, generally, in geographic proximity). The characteristics age and sex are geographically less clustered than tenure, race, and Hispanic origin. Further, the value of age or sex is often considerably affected by specific conditions, such as the person's relationship to the reference person, or whether information is available on the person's spouse. Thus, national distributions conditioned on relevant covariates were used to impute for age and sex. These distributions were constructed before the imputation began, without regard to the imputation for other missing characteristics.

Table 6-2. Percent of Characteristic Imputation in the P and E Samples [Unweighted]

                            Percent of people with imputed characteristic        Percent of people with one or
 Sample     Total people    Age   Sex   Tenure   Race   Hispanic origin         more imputed characteristics
 P sample   706,245         2.5   1.7   1.9      1.4    2.4                     5.5
 E sample   704,602         3.1   0.3   3.8      3.5    3.6                     11.2

Age. The value of age was missing for 2.5 percent (unweighted) of the P sample. When age was missing, one of four age categories (0-17, 18-29, 30-49, and 50 or older), rather than a number, was imputed, because only the category was used to assign people to a post-stratum for estimation. In one-person households, missing age was imputed from the distribution of ages reported in such households. In multiperson households, if the relationship to the reference person was missing, the distribution of ages (excluding those of reference persons) in all multiperson households was used. Otherwise, if the person was the spouse, child, sibling, or parent of the reference person, missing age was generally imputed from a distribution of reported ages using the relationship to the reference person and the age of the reference person. For reference persons, other relatives, and nonrelatives, age was imputed from the distribution of ages reported by persons with the same relationship. See Figure 6-2, on page 6-11, for details.
Sex. The imputation rate for sex was 1.7 percent in the P sample. For one-person households, sex was imputed from the distribution of sex in all one-person households.

To impute the sex of a reference person, if the household had more than one person but no spouse was present, the distribution of sex for reference persons of multiperson households with no spouse present was used. If a spouse was present, the missing sex of the reference person or the reference person's spouse was imputed as the sex opposite to that of the spouse. If sex was missing for the reference person and the spouse, then the sex of the reference person was imputed from the distribution of sex for reference persons with a spouse present. The spouse was then assigned the sex opposite to that of the reference person.

For other persons in multiperson households (that is, other than reference persons and spouses): 1) if the relationship to the reference person was missing, and if no one else in the household was recorded as a spouse of the reference person, sex was imputed from the distribution of sex for persons (excluding reference persons) from all multiperson households; 2) otherwise, sex was imputed from the distribution of sex for persons (excluding reference persons, spouses, and persons with missing relationship) from all multiperson households. Figure 6-3, on page 6-12, illustrates the procedure.

Tenure. Household tenure (owner versus nonowner) was missing for 1.9 percent of the people in the P sample. Tenure was imputed from the previous household that had the same type of basic address and had tenure recorded. As with the adjustment for noninterviews, three types of basic address were used: single-family, multiunit, and all other types of units. See Figure 6-4, on page 6-13, for further information.
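The tenure rule is a sequential hot deck, sketched below under stated assumptions. The sketch takes households as already sorted in processing order (presumably geographic, which the text implies but does not state). It also lets only reported tenure serve as a donor value. The function name and the sample records are hypothetical.

```python
def impute_tenure(households):
    """Fill missing tenure from the nearest previous household of the same address type.

    households: list of (address_type, tenure) tuples in processing order,
    where tenure is "owner", "nonowner", or None when missing.
    """
    last_reported = dict()  # address type -> most recent reported tenure
    filled = []
    for address_type, tenure in households:
        if tenure is None:
            # Use the donor if one exists; otherwise the value stays missing.
            tenure = last_reported.get(address_type)
        else:
            # Only recorded tenure updates the donor for later households.
            last_reported[address_type] = tenure
        filled.append((address_type, tenure))
    return filled

# Hypothetical records for five households in one block cluster.
sample = [
    ("single-family", "owner"),
    ("multiunit", "nonowner"),
    ("single-family", None),
    ("multiunit", None),
    ("other", None),
]
result = impute_tenure(sample)
```

The last household stays missing because no earlier household of its address type reported tenure. How the production system handled that case is not described here.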
Race. When race was missing (1.4 percent of the P sample), the imputed race could be any of the 63 possible combinations of the six basic race categories: White, Black, American Indian or Alaskan Native, Asian, Native Hawaiian or Other Pacific Islander, and Some Other Race. All 63 categories were treated the same in the imputation. That is, there were no special procedures for any categories or groups of categories.

Whenever possible, missing race was imputed from the same household. Independently for each household member with missing race, one person was selected at random from those household members with reported race and the selected person's race was imputed to the given household member. If race was missing for all household members but someone had reported origin (Hispanic or non-Hispanic), then the race distribution of the nearest previous household with any reported race and the same origin was used. Note that the Hispanic origin of the household was that of the first person on the household roster with origin reported. When race and Hispanic origin were missing for the whole household, the race distribution of the nearest previous household with reported race was used regardless of Hispanic origin. See Figure 6-5, on page 6-14, for details.

Hispanic Origin. A value of origin (Hispanic or non-Hispanic) was imputed for 2.4 percent (unweighted) of the P sample. The procedure was analogous to that for imputing missing race. That is, whenever possible, origin was imputed from within the same household. If everyone in the household was missing origin, then the nearest previous household with reported origin and the same race category was used. When both Hispanic origin and race were missing for the whole household, the Hispanic origin distribution of the nearest previous household with reported Hispanic origin was used, regardless of race. For the imputation procedure and the race categories used in it, see Figure 6-6, on page 6-15.

For each of the five characteristics discussed, the distribution of imputed values did not necessarily mirror the distribution of reported values, nor was this expected. However, because the imputation rates were low in the P and E samples, the distributions before and after imputation were very similar. See the distribution of characteristics in the table below.

Distribution of Characteristics Before and After Imputation [Weighted]

                         P sample                          E sample
                         Before    Imputed   After         Before    Imputed   After
                         imputes             imputes       imputes             imputes
Race                     (1.4% imputed)                    (3.2% imputed)
  White only             73.5%     67.5%     73.4%         76.9%     57.2%     76.2%
  Black only             11.0%     10.2%     11.0%         11.8%     6.6%      11.6%
  AIAN only              0.6%      0.7%      0.6%          0.8%      0.8%      0.8%
  Asian only             3.5%      3.4%      3.5%          3.7%      2.9%      3.7%
  NHPI only              0.1%      0.3%      0.1%          0.1%      0.3%      0.1%
  Some other race only   8.3%      14.4%     8.4%          4.5%      28.5%     5.3%
  Multiple races         3.0%      3.5%      3.0%          2.3%      3.7%      2.3%
Hispanic origin          (2.3% imputed)                    (3.4% imputed)
  Hispanic               12.4%     11.5%     12.4%         12.5%     9.0%      12.4%
Age                      (2.4% imputed)                    (2.9% imputed)
  0-17                   26.1%     21.7%     26.0%         25.9%     19.7%     25.7%
  18-29                  16.7%     18.9%     16.7%         15.5%     19.0%     15.6%
  30-49                  30.7%     33.0%     30.8%         31.0%     30.9%     31.0%
  50+                    26.5%     26.4%     26.5%         27.6%     30.5%     27.6%
Sex                      (1.7% imputed)                    (0.2% imputed)
  Male                   48.4%     47.2%     48.3%         48.8%     53.9%     48.8%
  Female                 51.6%     52.8%     51.7%         51.2%     46.1%     51.2%
Tenure                   (1.9% imputed)                    (3.6% imputed)
  Owner                  68.4%     70.3%     68.4%         69.9%     65.1%     69.7%
  Nonowner               31.6%     29.7%     31.6%         30.1%     34.9%     30.3%

Assigning Probabilities for Unresolved Cases (Person Level)

After all follow-up activities were completed, there remained a small fraction of the A.C.E. sample without enough information to compute the components of the dual system estimator given in Chapter 7. Their status was said to be unresolved. A procedure called imputation cell estimation was used to assign probabilities for P-sample people with unresolved match or Census Day residence status, and for E-sample people with unresolved enumeration status.

All P- and E-sample persons, resolved and unresolved, were placed into groups called imputation cells based on operational and demographic characteristics. Different variables were used to define cells for P-sample match and residence status and in the E sample for enumeration status. Within each imputation cell the weighted average of 1s and 0s (representing, e.g., match and nonmatch, respectively) among the resolved cases was calculated, and that average was imputed for all unresolved persons in the cell.

One should note that the noninterview adjustment factor was not incorporated into the person weights when these averages were calculated. This is because the noninterview adjustment was designed to spread the weight of noninterviewed housing units over interviewed housing units. However, all persons with resolved residence status in noninterviewed units were nonresidents (since, by definition, if one person in the household was a resident then the household was considered an interview). Therefore, using the noninterview factor to calculate the averages for unresolved cases would have produced a biased estimate of residence probability. The issue of which weights to use was moot when resolving E-sample cases with missing enumeration status, as a noninterview adjustment was not applied to E-sample persons.

Thus, the weights, $w_i$, used here incorporated all stages of sampling, including the selection of people for targeted extended search, but were not adjusted for household
noninterviews. Any trimming of the weights was also performed before these weighted averages were calculated.

Unresolved Residence Status in the P Sample

After follow-up was completed, all persons in the P sample who were eligible to be matched to the Census (see Chapter 4) were classified into three types, according to their status as a resident in their sampled block at the time of the census: Census Day residents, Census Day nonresidents, and unresolved persons, those for whom there was not enough information to determine the residence status. The results are displayed in Table 6-3.

Table 6-3. Final Residence Status for the P Sample by Mover Status [Unweighted]

                              Final residence status                     Residence rate
                 Total        Confirmed    Confirmed      Unresolved     for resolved
                 people       resident     nonresident    resident       cases
U.S. total       653,337      95.8%        1.9%           2.3%           98.1%
Mover status
  Nonmover       627,992      96.6%        1.7%           1.7%           98.3%
  Outmover       25,345       75.2%        7.5%           17.4%          91.0%

Because of the uncertainty of the actual status of the 15,082 people (2.3 percent of 653,337) with unresolved residence status, a probability of being a Census Day resident was assigned (see equation (6.2)). Then, when computing the dual system estimate, all person nonmovers and outmovers were included with their estimation weight (see Chapter 7) and the following residence probability:
\[ Pr_{res,j} = \begin{cases} 1, & \text{if person } j \text{ is a resident on Census Day} \\ 0, & \text{if person } j \text{ is NOT a resident on Census Day} \\ Pr^{*}_{res,j}, & \text{if person } j \text{ is unresolved} \end{cases} \qquad (6.1) \]

To assign $Pr^{*}_{res,j}$ for unresolved cases, the Census Day residence probability for inmovers was irrelevant for estimation and was not used. Only nonmovers and outmovers in the P sample who had a resolved final residence status and went through the person matching operation (formally, those with a final match-code status) were used.

They were placed into a number of imputation cells as defined in Table 6-4. Within each cell, among the resolved cases (those with $Pr_{res,j}$ = 1 or 0) the weighted proportion of Census Day residents, that is, the weighted average of 1s and 0s, was computed:
\[ Pr^{*}_{res,j} = \frac{ \sum_{i \in \text{resolved persons} } w_i \, Pr_{res,i} }{ \sum_{i \in \text{resolved persons} } w_i } \qquad (6.2) \]

where $w_i$ was defined at the beginning of this section. This proportion was then assigned as $Pr^{*}_{res,j}$ to each unresolved case in the cell. (The exception is for follow-up match code group 7; this is explained below.) The cells used to resolve residence status, along with the probabilities assigned to the unresolved cases, are given in Table 6-4.

Match code groups 1 through 7, which partition the population into mutually exclusive and exhaustive groups, were determined from the match codes and other variables derived before the follow-up operation as explained in Chapter 4. Group 8 was formed differently. Some information from the follow-up operation was coded in time for the A.C.E. missing data procedures. (Under the original schedule, this information would have become available too late to be of use.) After the follow-up operation, a small number of people in the P sample were coded as being potentially fictitious or said to be living elsewhere on Census Day. Such people were placed in Group 8, even though they also qualified for one of the Groups 1 through 7.

Table 6-4. Imputation Cells and Probabilities Assigned for Resolving Residence Status in the P Sample

                                                      Owner                     Nonowner
                                                      Non-Hispanic              Non-Hispanic
Match code group                                      White          Others     White          Others
1 = Matches needing follow-up                         0.982          0.986      0.993          0.991
2 = Possible matches                                  0.973          0.968      0.966          0.972
3 = Partial household nonmatches needing follow-up
      V3a*                                            0.755          0.901      0.883          0.928
      V3b*                                            0.956          0.971      0.959          0.969
4 = Whole household nonmatches needing follow-up,
    not conflicting households                        0.920          0.943      0.911          0.914
5 = Nonmatches from conflicting household             0.910          0.927      0.945          0.954
6 = Resolved before follow-up                         0.993          0.990      0.990          0.988
7 = Insufficient information for matching
    (weighted column average of groups 1-5 and 8)     0.813          0.867      0.844          0.872
8 = Potentially fictitious or said to be living
    elsewhere on Census Day                           0.119          0.123      0.177          0.157

*V3a = Group 3 persons age 18-29 listed as child of reference person; V3b = All other group 3 persons.

The two tenure categories were owners and nonowners. Persons were placed into one of two race categories: non-Hispanic White and all others. People of multiple races (for example, a person responding as White and Asian) were placed in the latter group. V3 was a variable defined only for match code group 3, partial household nonmatches.
V3a comprised persons in group 3 who were 18 to 29 years of age and were listed on the A.C.E. household roster as a child of the reference person. V3b included all other persons in group 3.

The residence probability for unresolved P-sample persons was computed as described above, except for those in match code group 7, people with insufficient information for matching. Within this set of four cells (see Table 6-4), there were almost no resolved cases from which to extract a probability of being a Census Day resident. Because of the lack of information (most of these cases did not even have a valid name), these people did not go through the matching operation and were not sent to follow-up. To adjust for these cases, a weighted proportion of Census Day residents (1s and 0s) was computed among the resolved cases in each of the four columns of Table 6-4 using match code groups 1 through 5 and 8. Separately for each of the four tenure-by-race/ethnicity classes, the overall weighted probability of being a resident among those sent to follow-up (groups 1 through 5 and 8) was assigned to those with insufficient information for matching (group 7). Left out of this computation were those people who were resolved before follow-up (group 6). Observations from the Census 2000 Dress Rehearsal indicated that, in terms of their demographic and operational characteristics, people in group 7 tend to be more like those in groups 1 through 5 and 8 than like those in group 6.

In the Dress Rehearsal, only three weighted ratios were calculated for residence probability: a ratio for persons sent to follow-up, a ratio for persons not needing follow-up, and an overall ratio used for persons with insufficient information for matching. Based on Dress Rehearsal results, Kearney and Ikeda (1999) suggest calculating separate ratios by match code group and splitting persons from conflicting households into a separate match code group. The larger Accuracy and Coverage Evaluation sample size in Census 2000 than in the Dress Rehearsal made it possible to separate matches needing follow-up from possible matches. Additional research and discussion suggested adding further variables within match code group.

Unresolved Match Status

Computing the dual system estimator required measuring the total number of P-sample people who were matched to persons included in the census. (Separate estimates were obtained for nonmovers and outmovers, but that does not affect what follows.) After follow-up activities were completed, each confirmed or possible (unresolved) Census Day resident in the P sample was determined to be a match, a nonmatch, or unresolved (that is, persons for whom match status could not be determined). Match status of confirmed Census Day nonresidents was not used in the estimation. As is seen in Table 6-5, unresolved matches were infrequent in the P sample.

The treatment of unresolved matches was similar to that for unresolved residence status. For each confirmed or possible Census Day resident j in the P sample, the value $Pr_{m,j}$ was assigned as 1, 0, or $Pr^{*}_{m,j}$, in a manner analogous to equation (6.1), according to whether the person was a match, a nonmatch, or had unresolved match status, respectively. Unresolved matches accounted for 7,826 of 640,945 people in the P sample, or 1.2 percent. $Pr^{*}_{m,j}$ was assigned using imputation cell estimation based on those with a resolved match status. The formula is the same as in equation (6.2), but pertains to match status, that is, uses the values of $Pr_{m,j}$.

Table 6-5. Final Match Status for the P Sample by Mover Status [Unweighted]

P sample                                Final match status                     Match rate
(confirmed or         Number of                                Unresolved      for resolved
possible residents)   persons           Match      Nonmatch    match           cases
U.S. total            640,945           90.3%      8.5%        1.2%            91.4%
Mover status
  Nonmover            617,490           91.1%      8.0%        0.9%            91.9%
  Outmover            23,455            67.8%      21.7%       10.5%           75.8%

As with residence status, the cases were first classified according to several characteristics. Within cells, the weighted proportion of matches among the resolved cases
(excluding all confirmed Census Day nonresidents) was computed and assigned to each of the unresolved cases in the same cell. Again, the weights, $w_i$, are defined earlier.

The characteristics used to define the imputation cells for match status, different from those used for residence status, are shown in Table 6-6. They were based on observations from the Census 2000 Dress Rehearsal and an analysis of the A.C.E. operations. Kearney and Ikeda (1999) showed that mover status (nonmover versus outmover) discriminated well between matches and nonmatches among the resolved cases. The housing unit address match code refers to the initial match between housing units on the independent (A.C.E.) listing and the census address list; conflicting housing units were determined during A.C.E. person matching activities.

People with at least one imputed demographic variable (i.e., age, sex, race, Hispanic origin, or tenure) were grouped together for imputation of match status. Unpublished studies indicate that, at least in the Dress Rehearsal, the presence of these imputed characteristics among resolved cases is negatively associated with the propensity to be a match. For outmovers from a unit that was a nonmatch or a conflicting household, people were not separated according to their imputed characteristics. The reason was to maintain a reasonable number of resolved cases in each cell from which to estimate the weighted proportion of matches. The probabilities assigned to people with unresolved match status are provided in Table 6-6.

It is useful to note that most persons with unresolved match status (7,693 of the 7,826) had insufficient information for matching; most of them did not have a valid name, and their rate of missing characteristics was much higher than the average. Further, almost all of these people (7,506) were in match code group 7. As such, they did not go through the matching process, nor were they sent for follow-up. This information was considered when cells were selected for imputation of match status. Variables such as age and ethnicity, which had a high chance of being imputed and might be of questionable quality, were avoided.

In the Dress Rehearsal, within each of the four geographic sites, one overall weighted ratio for match probability was calculated and used. Kearney and Ikeda (1999) suggest that separate ratios for outmovers and nonmovers should be calculated.

Table 6-6. Imputation Cells and Probabilities Assigned for Resolving Match Status in the P Sample

                 Housing unit address match code
                 Housing unit was a match            Housing unit was a nonmatch or the
                 (code 1)                            household is conflicting (code 2 or 4)
Mover status     No imputes    1 or more imputes     No imputes    1 or more imputes
Nonmover         0.945         0.901                 0.690         0.567
Outmover         0.798         0.791                 0.516 (one cell for both columns)

Unresolved Enumeration Status (E Sample)

The dual system estimator also required the total number of correct enumerations in the E sample. As with operations previously discussed, follow-up activities left each person in the E sample with one of three types of enumeration status: correct, erroneous, or unresolved. The person was assigned a number, $Pr_{ce,j}$, equal to 1, 0, or $Pr^{*}_{ce,j}$, respectively, according to that status, similar to equation (6.1). Table 6-7 shows the distribution of persons according to enumeration status. The values of $Pr^{*}_{ce,j}$ for the 21,148 unresolved E-sample people (3.0 percent of 704,602) were determined through imputation cell estimation.

Table 6-7. Final Enumeration Status for the E Sample [Unweighted]

                              Final enumeration status                      Correct enumeration
               Number of      Correct        Erroneous      Unresolved      rate for resolved
E sample       persons        enumeration    enumeration    enumeration     cases
U.S. total     704,602        92.6%          4.4%           3.0%            95.5%

The resolved and unresolved cases were placed in the cells defined in Table 6-8. Within each cell, the weighted proportion of correct enumerations among resolved cases was computed before accounting for duplication with non-E-sample people, analogous to equation (6.2), and then assigned to each unresolved case in the cell.

As with residence status for P-sample people, a key factor in determining enumeration status was the E-sample person's match code. These codes can be found in Chapter 4. People were placed in match code groups accordingly in the following sequence: 1) People coded as potentially fictitious or said to be living elsewhere on Census Day (based on information collected during the follow-up operation) were placed in groups 11 and 12, respectively.
2) All other people included in the operation for targeted extended search were placed in group 10. See Chapter 5 for details. 3) People in the remainder of the E sample were then placed in the appropriate match code group, as defined in Table 6-8. Other characteristics used to define cells were the presence or absence of imputed characteristics (as was used to define cells for match status); whether the person was non-Hispanic White or any other race-ethnicity combination; and V3, as defined in the section on residence status.

There was an additional adjustment made to the enumeration probability of E-sample people as a result of duplication with persons subsampled out of the E sample in large clusters. If the same identity was assigned to u E-sample persons and v persons who were subsampled out of the E sample, 1) one of the u E-sample persons was selected during the person matching operation, and 2) the initial correct enumeration probability was multiplied by u/(u+v) during the missing data activities, as it was not known which person was the actual E-sample person. The other u-1 E-sample persons were assigned a correct enumeration probability of 0.

Table 6-8. Probabilities Assigned for Resolving Enumeration Status in the E Sample

                                                         No imputed                      1 or more imputed
Match code group                                         characteristics                 characteristics
1 = Matches needing follow-up                            0.977                           0.977
2 = Possible matches                                     0.968                           0.968
3 = Partial household nonmatches
      V3a*                                               0.871                           0.908
      V3b*                                               0.974                           0.960
4 = Whole household nonmatches where the housing unit    Non-Hispanic White 0.965        0.958
    matched; not conflicting households                  Others 0.974
5 = Nonmatches from conflicting households; for housing
    units not in regular nonresponse follow-up           0.975                           0.965
6 = Nonmatches from conflicting households; housing
    units in regular nonresponse follow-up               0.914                           0.926
7 = Whole household nonmatches, where the housing unit   Non-Hispanic White 0.959        0.950
    did not match in housing unit matching               Others 0.947
8 = Resolved before follow-up                            Non-Hispanic White 0.995        0.979
                                                         Others 0.990
9 = Insufficient information for matching                0.000 (one cell for both columns)
10 = Targeted extended search people                     0.928                           0.858
11 = Potentially fictitious people                       0.058                           0.088
12 = People said to be living elsewhere on Census Day    0.229                           0.210

*V3a = Group 3 persons age 18-29 listed as child of reference person; V3b = All other group 3 persons.

Figure 6-1. Adjustment for Noninterviews: An Example

Consider a block cluster with nine housing units, all having the same type of basic address, for example, all single-family homes, as depicted below.

Housing unit; Weight; Actual situation; Status of (and information from) A.C.E. interview; Census Day interview status; A.C.E. Interview Day interview status
1; 100; Resident on 4/1/00 and at time of A.C.E. interview; Interviewed in A.C.E.; Interview; Interview
2; 100; Resident on 4/1/00 and at time of A.C.E. interview; Neighbor (proxy) interviewed in A.C.E.; Interview; Interview
3; 100; Resident on 4/1/00 and at time of A.C.E. interview; No one interviewed in A.C.E.; Noninterview; Noninterview
4; 100; Vacant on 4/1/00, resident at time of A.C.E. interview; Interviewed in A.C.E., knows of 4/1/00 status; Vacant; Interview
5; 100; Vacant on 4/1/00, resident at time of A.C.E. interview; Interviewed in A.C.E., no knowledge of 4/1/00 status; Noninterview; Interview
6; 100; Vacant on 4/1/00, resident at time of A.C.E. interview; No one interviewed in A.C.E.; Noninterview; Noninterview
7; 100; Resident on 4/1/00, vacant at time of A.C.E. interview; Information obtained from proxy; Interview; Vacant
8; 100; Resident on 4/1/00, vacant at time of A.C.E. interview; No information on 4/1/00 status, Census staff determines vacant at time of A.C.E.; Noninterview; Vacant
9; 100; Resident on 4/1/00, different resident at time of A.C.E. interview; Interviewed in A.C.E., knows of 4/1/00 status; Interview; Interview

Note: In this noninterview cell (sample block cluster x type of basic address), people in interviewed housing units would have received the following noninterview adjustments: a) To the person weights of nonmovers and outmovers, Census Day Noninterview adjustment = 800/400 = 2. b) To the person weights of inmovers, A.C.E. Interview Day Noninterview adjustment = 700/500 = 1.4.
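The two factors in the note to Figure 6-1 can be reproduced with a short sketch. It assumes the factor is the weight of interviewed plus noninterviewed units divided by the weight of interviewed units, with vacant units left out, which is what the figure's arithmetic implies. The function name and data layout are illustrative only.

```python
# Figure 6-1 as data: (housing unit, weight, Census Day status, A.C.E. Interview Day status).
UNITS = [
    (1, 100, "interview", "interview"),
    (2, 100, "interview", "interview"),
    (3, 100, "noninterview", "noninterview"),
    (4, 100, "vacant", "interview"),
    (5, 100, "noninterview", "interview"),
    (6, 100, "noninterview", "noninterview"),
    (7, 100, "interview", "vacant"),
    (8, 100, "noninterview", "vacant"),
    (9, 100, "interview", "interview"),
]

def noninterview_adjustment(units, status_index):
    """Weight of interviewed plus noninterviewed units over weight of interviewed units."""
    interviewed = sum(u[1] for u in units if u[status_index] == "interview")
    noninterviewed = sum(u[1] for u in units if u[status_index] == "noninterview")
    if interviewed == 0:
        raise ValueError("no interviewed units in this noninterview cell")
    return (interviewed + noninterviewed) / interviewed

census_day_factor = noninterview_adjustment(UNITS, 2)  # applies to nonmovers and outmovers
ace_day_factor = noninterview_adjustment(UNITS, 3)     # applies to inmovers
```

With the figure's data the Census Day factor is 800/400 and the A.C.E. Interview Day factor is 700/500, matching the note.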
Chapter 7. Dual System Estimation

INTRODUCTION

Dual System Estimation (DSE) was used to estimate coverage of Census 2000 using data from the Accuracy and Coverage Evaluation (A.C.E.) Survey. DSE was also used by the U.S. Census Bureau to estimate census coverage for the 1980 and 1990 censuses, and to evaluate coverage prior to 1980. The use of DSE for measurement of coverage in 1980 is described in Fay et al. (1988), while Hogan (1992, 1993) describes the use of DSE in 1990. As described in Killion (1998), several alternatives to DSE were considered for Census 2000. These alternatives were either shown to produce results grossly inferior to DSE or research was not conclusive.

This chapter provides the details of DSE for the Census 2000 A.C.E. The DSE was calculated separately for a set of population groups referred to as post-strata. The post-stratification variables and the final post-stratification plan are discussed in detail. In addition, the variance estimation methodology used in each post-stratum is summarized and some basic results are given.

DUAL SYSTEM ESTIMATION

This section contains the details of the DSE calculated within each final post-stratum. It describes the basic DSE model, including a discussion of the advantage of post-stratification. The details of the DSE computed within each final post-stratum for Census 2000 are presented. All components of the DSE are defined. The DSE accounted for special handling of missing data, search areas for matching, and movers. Missing data and search areas for matching are covered in detail in Chapters 6 and 5, respectively. The method used to handle special problems caused by movers in Census 2000 DSE is also discussed. The attachment provides detailed background on options for dealing with movers in census coverage measurement surveys. The section concludes with a short discussion of how the DSE results serve as input to synthetic estimation down to the block level. A detailed discussion of synthetic estimation is provided in Chapter 8 and Haines (2001).

DSE Model

The DSE model is discussed in detail in Wolter (1986) and more generally in Hogan (1992). This chapter gives a general presentation. The DSE model (applied within each post-stratum) conceptualizes each person as having a probability of being either in or not in the census enumeration, as well as either in or not in the A.C.E.

Table 7-1. DSE Model

                   In census    Out of census    Total
In A.C.E.          N_11         N_12             N_1+
Out of A.C.E.      N_21         N_22             N_2+
Total              N_+1         N_+2             N_++

All cells are conceptually observable except $N_{22}$ and any of the marginal cells that include $N_{22}$ (i.e., $N_{2+}$, $N_{+2}$, and $N_{++}$). The model assumes independence between the census and the A.C.E. This means that the probability of being in the ij-th cell, $p_{ij}$, is the product of the marginal probabilities, $p_{i+} p_{+j}$. The estimate of total population in a post-stratum with the independence assumption is

\[ DSE = \hat{N}_{++} = \frac{N_{+1} N_{1+} }{N_{11} }. \]

The independence assumption can be in error, either due to causal dependence between the census enumeration and the A.C.E. enumeration, or due to heterogeneity in capture probabilities within a post-stratum. Causal dependence occurs when the event of an individual's inclusion or exclusion from one system affects his or her probability of inclusion in the other system. For example, some people who did answer the census may not have cooperated with the A.C.E., thinking they had helped enough.
As another example, a person contacted during A.C.E. listing may not have responded to the census thinking that the A.C.E. lister already recorded them. However, even if causal independence is true for all individuals ($p_{ij} = p_{i+} p_{+j}$), the independence assumption can be violated by heterogeneity. For independence to hold, either the census inclusion probabilities $p_{+1}$ or the A.C.E. inclusion probabilities $p_{1+}$ must be the same for all individuals. This means that homogeneity in both systems is not required. For example, some people may try their best to avoid being counted in both the census and A.C.E., resulting in these people having much smaller inclusion probabilities than other people. Error in the independence assumption for either reason results in correlation bias.

Post-stratification, or grouping of individuals likely to have similar inclusion probabilities, and calculating DSEs within post-strata was done to decrease correlation bias. Research was carried out to determine effective variables for the A.C.E. post-stratification design. All variables included in the 1990 PES post-stratification were considered, as were several new ones. The specific variables considered were race/Hispanic origin, age/sex, tenure, household composition, relationship, urbanicity, percent owner, return rate, percent minority, type of enumeration area, household size, hard-to-count scores, census division, census region, and regional census center. From these variables, fifteen post-stratification options were developed for empirical research. For each post-stratification option, mean square errors of total population estimates and synthetic estimates were computed at the national, state, and congressional district levels, as well as for selected cities. The major conclusions were as follows:

* The demographic variables used in the 1990 PES were effective, but did not fully capture the geographic differences, especially those affected by the quality of the Master Address File. An urbanicity/type of enumeration variable appeared to capture much of the geographic differences.
* The tract-level return rate variable captured some of the socioeconomic differences for synthetic estimates at lower levels of aggregation.

Details of the Census 2000 post-stratification research methodology are given in Kostanich et al. (1999) and Griffin (1999). Results of this research are given in Griffin and Haines (2000) and Schindler (2000). The post-stratification design chosen for Census 2000 is provided in this chapter.

The DSE can be written as follows:
\[ DSE = N_{+1} \left( \frac{N_{1+} }{N_{11} } \right) \]

That is, the total population is estimated by the number captured in the census times the ratio of those captured in the A.C.E. survey to those captured in both systems. In practice, the components of the DSE are estimated from a sample survey. $N_{+1}$ is not the census count; the census count (C) must be corrected for erroneous enumerations, as well as for persons enumerated in the census with insufficient information to match to the A.C.E. enumeration. To actually estimate the number of people correctly enumerated in the census, a sample of all data-defined persons is selected. This sample of data-defined census persons is called the enumeration or E sample. To estimate the ratio of those captured in both systems to those captured in A.C.E., the population or P sample is used. The P sample consists of persons interviewed during A.C.E. enumeration.

The form of the DSE used in census coverage measurement surveys such as A.C.E. is as follows:
\[ DSE = DD \times \left( \frac{CE}{N_e} \right) \times \left( \frac{N_p}{M} \right) \]

where
DD = the number of census data-defined persons eligible and available for A.C.E. matching,
CE = the estimated number of correct enumerations from the E sample,
$N_e$ = the estimated number of people from the E sample,
$N_p$ = the estimated total population from the P sample,
M = the estimated number of persons from the P-sample population who match to the census.

Note: Persons in Group Quarters are excluded from all the above counts for A.C.E., as were persons in housing units who were added to the census after E-sample Identification (late adds).
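As a rough numerical illustration of this form of the DSE, the sketch below multiplies the data-defined count by the correct enumeration rate and by the inverse of the match rate. The function name and all input values are hypothetical; they are not A.C.E. results.

```python
def dual_system_estimate(dd, ce, n_e, n_p, m):
    """DSE = DD * (CE / N_e) * (N_p / M) for one post-stratum."""
    if n_e <= 0 or m <= 0:
        raise ValueError("N_e and M must be positive")
    correct_enumeration_rate = ce / n_e   # estimated from the E sample
    inverse_match_rate = n_p / m          # estimated from the P sample
    return dd * correct_enumeration_rate * inverse_match_rate

# Hypothetical post-stratum: 1,000,000 data-defined census persons,
# 95 percent correct enumerations, 90 percent of the P sample matched.
dse = dual_system_estimate(dd=1_000_000, ce=950.0, n_e=1_000.0, n_p=1_000.0, m=900.0)
```

In this made-up case the estimate exceeds the data-defined count because the share of P-sample people missed by the census (10 percent) outweighs the share of erroneous enumerations (5 percent).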
Definitions

Block Cluster. A grouping of one or more census blocks. Block clusters are the primary sampling units for A.C.E. and average about 30 housing units each.

Correct Enumeration (CE). A correct enumeration is a person who is enumerated in a sample block cluster during the census who is also determined by A.C.E. operations to have lived in that block cluster (or, if appropriate, a surrounding block) on Census Day. Correct enumerations have a correct enumeration probability, $Pr_{ce,j}$, equal to 1 for each person j.

Correct Enumeration Probability ($Pr_{ce,j}$). This is defined as the probability that person j in the E sample was correctly enumerated in the A.C.E. block cluster (or surrounding block). The probability of correct enumeration is typically 0 or 1, but it can take on values within this range due to missing data imputation.

Coverage Correction Factor (CCF). The coverage correction factor for a post-stratum is calculated by dividing the DSE for that post-stratum by its census count. A.C.E. synthetic estimates for any data item for any geographic area are obtained by multiplying the coverage correction factor by the census count within each post-stratum, then summing over all post-strata (see Chapter 8 for details on synthetic estimation).

Data-Defined Person. This concept is defined for all census persons. A data-defined person is a person who has two or more of the 100-percent data items answered on the census form. Any items can be selected from the 100-percent data items, which include name, age, sex, race, and Hispanic origin. Relationship to person one is also a 100-percent data item for all persons besides person one. Persons not satisfying these criteria are referred to as non-data-defined.

E Sample. The E sample is the Enumeration sample. It consists of all data-defined persons in the A.C.E. block clusters who were enumerated in the census.

Group Quarters (GQ) Persons. Persons living in GQs, such as college dormitories, prisons, or military barracks. GQ persons were not covered in the A.C.E. and are excluded from the A.C.E. universe.

Inmover. A person who moved into a P-sample housing unit after Census Day.

Insufficient Information in Census (II). Those persons in the census for whom there is insufficient information for inclusion in the E sample. Very little data is available for these persons. This category includes non-data-defined persons and persons in whole household imputations. Note that insufficient information in census is different from insufficient information for matching. The former are excluded from the E sample and the latter are included in the E sample.

Late Adds. Late Adds are persons in housing units who were added to the census after E-sample Identification. These housing units had an unknown final status at the time of A.C.E. matching but were subsequently included in the census. Persons who are Late Adds were ineligible for matching and, therefore, not included in the census DSE component.

Match Probability ($Pr_{m,j}$). This is defined as the probability that person j in the P sample was matched to a census person in the search area (or in a TES block). The match probability is typically 0 or 1, but it can take on values within this range due to missing data imputation.

Mover Status. Each person in the P sample was classified as a nonmover, outmover, or inmover.

Nonmover. An A.C.E. sample person whose housing unit on Census Day and A.C.E. Interview Day are identical.

Outmover. A person who moved out of an A.C.E. housing unit between Census Day and the date of the A.C.E. interview.

P Sample. Also known as the Population sample. The P sample consists of those persons confirmed to be residents of the housing units in the A.C.E. block clusters as of Census Day by the independent portion of the A.C.E. reinterview and subsequent operations.

Residence Probability ($Pr_{res,j}$). The probability that person j on the P-sample file is a resident of the sample household on Census Day. All inmovers are assumed to be A.C.E. Interview Day residents. Nonmovers and outmovers can be Census Day nonresidents, if information indicates they were not a resident of the sample household based on census residency rules. The residence probability is typically 0 or 1, but it can take on values within this range due to missing data imputation.

Targeted Extended Search (TES). A.C.E. operation in which block clusters are identified and selected for a search of the immediate surrounding area to find persons geographically mis-located in a block neighboring the A.C.E. block cluster. More generally, it is the methodology for targeting, sampling, and implementing the search operations in the field.

DSE Formula

The DSE for any given post-stratum was calculated by:
$$DSE = DD \left( \frac{CE}{N_e} \right) \left[ \frac{N_n + N_i}{M_n + \left( \frac{M_o}{N_o} \right) N_i} \right]$$

All counts and estimates are for a specific post-stratum and the subscripts n, i, and o stand for nonmovers, inmovers, and outmovers, respectively. Adjustments to this DSE were occasionally made to avoid the unlikely event that the formula results in division by zero. For post-strata with less than ten (unweighted) outmover persons, the ratio inside the square brackets was changed to the following:

$$\frac{N_n + N_o}{M_n + M_o}$$

Coverage Correction Factor Formula

The coverage correction factor (CCF) is a measure of the net overcount or net undercount of the household population within the census. The CCF for a post-stratum is the ratio of the DSE to the census count:
$$CCF = \frac{DSE}{C}$$

where
C = the final census household population count, where C = DD + II + LA,
II = the number of census people with insufficient information,
LA = the number of people added (late) to the census and not available for A.C.E. matching. Late Adds include both data-defined and non-data-defined records.

Note: The numerator of the CCF is based on data-defined persons. The denominator includes data-defined and non-data-defined persons as well as late adds. Thus, we are implicitly assuming the coverage of late adds and non-data-defined persons is the same as that for data-defined persons.

For example, a coverage correction factor of 1.05 would imply that for every 100 people within the given post-stratum, the net undercount is five persons.

DSE Components

Each component of the DSE is described next.

DD is the census count (unweighted) of data-defined persons in the post-stratum.

The estimated number of E-sample persons is written as:
$$N_e = \sum_{j \in \text{E sample} } W_j^{*}$$

where $W_j^{*}$ = inverse of the probability of selection, including a factor for Targeted Extended Search sampling.

The estimated number of correct enumerations is calculated as:

$$CE = \sum_{j \in \text{E sample} } Pr_{ce,j} \, W_j^{*}$$

where $Pr_{ce,j}$ is:
1 if person j is correctly enumerated,
0 if person j is NOT correctly enumerated, or
$Pr^{*}_{ce,j}$ if person j is unresolved, where $Pr^{*}_{ce,j}$ is estimated through missing data imputation.

Note: Probabilities for persons with unresolved final correct enumeration status in the E sample or unresolved final residence or match status in the P sample are assigned using imputation cell estimation within groups. See Chapter 6 for details. Within each group, a probability equal to a simple proportion is imputed for unresolved persons. For example, E-sample (or P-sample) persons in a group with unresolved enumeration (match) status were assigned a correct enumeration (match) probability that is the proportion of correct enumerations (matches) among persons with resolved enumeration (match) status in the group. The probabilities are estimated in the DSE formulas as:

$Pr^{*}_{m,j}$ is the estimated match probability for unresolved match status
$Pr^{*}_{res,j}$ is the estimated residence probability for unresolved residence status
$Pr^{*}_{ce,j}$ is the estimated enumeration probability for unresolved enumeration status

Some persons moved between Census Day and A.C.E. Interview Day. A mover is a person whose location on the day of the A.C.E. interview differs from his or her location on Census Day. The treatment of movers has important ramifications for estimation. The attachment to this chapter titled The Effect of Movers on Dual System Estimation provides a discussion on alternative methodologies for handling movers. For Census 2000, movers were treated by a procedure known as Procedure C, unless a post-stratum had less than ten (unweighted) outmover persons.
In this case, Procedure A was implemented. Procedure C identifies all current residents living or staying at the sample address at the time of the A.C.E. interview (nonmovers and inmovers), plus all other persons who lived at the sample address on Census Day who have since moved (outmovers). The P sample includes nonmovers and outmovers. For outmovers, the interviewers attempted a proxy interview to obtain data such as name, sex, and age that was used for matching. The match rate for inmovers was estimated by the match rate of outmovers. In contrast, the number of movers in the P sample for A.C.E. sample areas was estimated by the inmovers. Note that no matching was done for inmovers.
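As a rough illustration of how the DSE formula, the outmover rule, and the coverage correction factor fit together, the Python sketch below evaluates the DSE for one post-stratum from already-weighted component totals. It switches to the Procedure A ratio when the post-stratum has fewer than ten unweighted outmovers. The function and argument names are hypothetical, and the input totals are invented for illustration; they are not A.C.E. results.

```python
def dual_system_estimate(dd, ce, n_e, n_n, n_i, n_o, m_n, m_o, outmovers_unweighted):
    """DSE for one post-stratum from weighted component totals (illustrative names).

    Uses the Procedure C ratio unless the post-stratum has fewer than ten
    unweighted outmovers, in which case the Procedure A ratio
    (N_n + N_o) / (M_n + M_o) is used instead.
    """
    if outmovers_unweighted < 10:
        ratio = (n_n + n_o) / (m_n + m_o)
    else:
        # Inmovers supply the number of movers; outmovers supply the match rate.
        ratio = (n_n + n_i) / (m_n + (m_o / n_o) * n_i)
    return dd * (ce / n_e) * ratio


def coverage_correction_factor(dse, dd, ii, la):
    """CCF = DSE / C, with C = DD + II + LA."""
    return dse / (dd + ii + la)


# Invented totals for a single post-stratum.
totals = dict(dd=1000.0, ce=950.0, n_e=1000.0, n_n=900.0, n_i=100.0,
              n_o=80.0, m_n=810.0, m_o=64.0)
dse_c = dual_system_estimate(outmovers_unweighted=40, **totals)  # Procedure C
dse_a = dual_system_estimate(outmovers_unweighted=6, **totals)   # Procedure A
ccf = coverage_correction_factor(dse_c, dd=1000.0, ii=20.0, la=10.0)
print(round(dse_c, 2), round(dse_a, 2), round(ccf, 4))
```

With these made-up inputs the two procedures give slightly different estimates, because Procedure C applies the outmover match rate to the inmover total while Procedure A uses the outmover totals directly.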
$N_n$ is the weighted total population for nonmovers for the post-stratum from the P sample. The weight for each person j is the product of three values:
1. the inverse of the P-sample selection probability including a factor for the Targeted Extended Search sampling ($W_j^{*}$),
2. a noninterview adjustment based on Census Day interview status ($f^{*}_{c,j}$), and
3. a Census Day residence probability ($Pr_{res,j}$).

The estimated number of P-sample nonmovers is calculated as:

$$N_n = \sum_{j \in \text{Nonmovers} } f^{*}_{c,j} \, Pr_{res,j} \, W_j^{*}$$

where $Pr_{res,j}$ is:
1 if person j is a resident on Census Day,
0 if person j is NOT a resident on Census Day, or
$Pr^{*}_{res,j}$ if person j is unresolved, where $Pr^{*}_{res,j}$ is estimated through missing data imputation.

Note: Persons who were not residents on Census Day are not included in $N_n$ since $Pr_{res,j} = 0$ is a multiplicative factor in each person's contribution to $N_n$.

The estimated number of P-sample nonmover matches is written as:
$$M_n = \sum_{j \in \text{Nonmovers} } Pr_{m,j} \, f^{*}_{c,j} \, Pr_{res,j} \, W_j^{*}$$

where $Pr_{m,j}$ is:
1 if person j is a match on Census Day,
0 if person j is NOT a match on Census Day, or
$Pr^{*}_{m,j}$ if person j is unresolved, where $Pr^{*}_{m,j}$ is estimated through missing data imputation.

$N_i$ is the weighted total population for inmovers for the post-stratum from the P sample. The weight for each person j is the product of two values:
1. the inverse of the P-sample probability of selection ($W_j^{*}$ as defined above), and
2. a noninterview adjustment factor based on A.C.E. Interview Day status ($f^{*}_{a,j}$).

The estimated number of P-sample inmovers is denoted:
$$N_i = \sum_{j \in \text{Inmovers} } f^{*}_{a,j} \, W_j^{*}$$

Note that all inmovers are assumed to be A.C.E. Interview Day residents.

The estimated number of P-sample outmovers is written:
$$N_o = \sum_{j \in \text{Outmovers} } f^{*}_{c,j} \, Pr_{res,j} \, W_j^{*}$$

The estimated number of P-sample outmover matches is calculated as:
$$M_o = \sum_{j \in \text{Outmovers} } Pr_{m,j} \, f^{*}_{c,j} \, Pr_{res,j} \, W_j^{*}$$

Synthetic Estimation

The estimated coverage correction factors for each post-stratum were used to form synthetic estimates. Synthetic estimation combines coverage error results with census counts at the block level to produce adjusted block-level population estimates. The synthetic methodology assumes coverage correction factors do not vary within a post-stratum. As a result, one coverage correction factor is assumed to be appropriate for all geographic areas within each post-stratum. To obtain block-level synthetic estimates, block-level census counts for post-strata are multiplied by post-stratum coverage correction factors and aggregated. There is one coverage correction factor for each post-stratum, and each person in a block is in one post-stratum. For example, suppose all persons in a block fall into one of six post-strata. A synthetic estimate for this block is formed by summing the product of census counts for that block and post-stratum with its corresponding coverage correction factor. A controlled rounding technique was implemented, resulting in the creation of person records at the block level. Subsequent tabulations, based on the original and replicated records, are corrected for coverage error. A detailed discussion of synthetic estimation is provided in Chapter 8 and Haines
(2001).

POST-STRATIFICATION

Background

The goal of post-stratification for dual system estimation is to establish groups of persons who are expected to have similar coverage. A common assumption is that people who are subject to similar housing, language, education, and cultural attitudes would also share similar census coverage. Hogan (1993) indicated that tenure, race and ethnic origin, age/sex, and degree of urban development were reasonable markers for these similarities in the 1990 census. An earlier section noted, however, that the independence assumption of the DSE model can be in error due to heterogeneous capture probabilities within a post-stratum. Post-strata are formed to support DSE by grouping persons with similar census coverage, so as to reduce heterogeneity in capture probabilities for DSEs.

In many surveys, post-stratification is done to reduce variances and partially correct for problems in sampling or undercoverage. For DSE, the primary reason for post-stratification is to reduce heterogeneity bias. Any variance reduction or sampling bias correction associated with post-stratification is a bonus. In fact, the usual trade-off is that forming many post-strata reduces heterogeneity at the expense of adding variance. As the number of post-strata increases, fewer people in the coverage measurement survey fall into each individual post-stratum.

The post-stratification plan for Census 2000 A.C.E. is summarized in this section. Also, the detailed definitions of the post-stratification variables and the race/Hispanic origin domains are given. See Haines (2001b) for further details.

The 2000 A.C.E. differs from the 1990 Post-Enumeration Survey (PES) in that it has approximately twice the sample size of the PES. This larger sample size permitted the formation of more post-strata, which has the advantage of reducing correlation bias, as well as sampling variance. Additionally in 2000, multiple responses to the race question were permitted; in 1990 only one race could be selected.

The 1990 PES post-strata started with a cross-classification of seven variables: age, sex, race, Hispanic origin, tenure,
urbanicity, and region. There were 840 cells in the cross-classification. Collapsing was necessary in order to produce post-strata with sufficient sample for reliable Dual System Estimation (DSE). The collapsing reduced the number of post-strata to 357.

Race and Hispanic origin were considered the most important variables to retain in 1990. After collapsing, five race/Hispanic origin post-strata were maintained: Non-Hispanic White or Other, Black, Hispanic White or Other, Asian and Pacific Islander, and Reservation Indians. Off-reservation American Indians were placed in either the Non-Hispanic White or Other group or the Hispanic White or Other group, depending on whether they were of Hispanic origin. Within each of these race/Hispanic origin post-strata, seven age/sex categories were maintained. The other variables were collapsed in the following order: region, urbanicity, then tenure, if necessary. For American Indians residing on reservations, all these variables were collapsed. For Asian and Pacific Islanders, region and urbanicity were collapsed and tenure maintained. For the Black and Hispanic White or Other groups, region was collapsed for two levels of urbanicity. For Non-Hispanic White or Other, the full cross-classification of region, urbanicity and tenure was maintained. Griffin and Haines (2000b) provides a detailed table on the 1990 PES post-stratification.

Post-Stratification Plan

The Census 2000 A.C.E. retained most of the 1990 PES post-stratification variables and included several additional ones. Nine variables were used in 2000: age, sex, race, Hispanic origin, tenure, region, Metropolitan Statistical Area size/Type of Enumeration Area, and tract-level return rate. The Metropolitan Statistical Area size variable replaced the urbanicity variable that was not available until the summer of 2001. Type of Enumeration Area (TEA) and the tract return rate were two new features of the 2000 A.C.E. post-stratification. The mailout/mailback areas were differentiated from other types of enumeration areas. In addition, tracts were classified by high or low return rates. Multiple responses to the race question were
reflected in the race and Hispanic origin groupings.

Table 7-2 shows the 64 post-stratum groups for the Census 2000 A.C.E. Within each post-stratum group, there are seven age/sex groups (shown in Table 7-3). Thus, there was a maximum of 64 x 7 = 448 post-strata. The P-sample size was too small or the sampling variance too high for eight of the 64 post-stratum groups. For each of these eight groups, the 7 age/sex post-strata were collapsed into 3 post-strata (under 18; males 18+ and females 18+). As a result, direct DSEs were calculated within each of 416 post-strata, which were expanded to 448 DSEs using synthetic estimation for the collapsed groups. The post-stratification plan was chosen to reduce correlation bias without having an adverse effect on the variance of the dual system estimator. Following is a detailed description of the post-stratification variables including an explanation of the race/Hispanic origin domain assignment.

Table 7-2. Census 2000 A.C.E. 64 Post-Stratum Groups (U.S.)

Race/Hispanic origin domain number*                 High return rate     Low return rate
  Tenure / MSA/TEA                                  NE  MW  S   W        NE  MW  S   W

Domain 7 (Non-Hispanic White or Some other race)
  Owner
    Large MSA MO/MB                                 01  02  03  04       05  06  07  08
    Medium MSA MO/MB                                09  10  11  12       13  14  15  16
    Small MSA & Non-MSA MO/MB                       17  18  19  20       21  22  23  24
    All other TEAs                                  25  26  27  28       29  30  31  32
  Nonowner
    Large MSA MO/MB                                 33                   34
    Medium MSA MO/MB                                35                   36
    Small MSA & Non-MSA MO/MB                       37                   38
    All other TEAs                                  39                   40
Domain 4 (Non-Hispanic Black)
  Owner
    Large MSA MO/MB, Medium MSA MO/MB               41                   42
    Small MSA & Non-MSA MO/MB, All other TEAs       43                   44
  Nonowner
    Large MSA MO/MB, Medium MSA MO/MB               45                   46
    Small MSA & Non-MSA MO/MB, All other TEAs       47                   48
Domain 3 (Hispanic)
  Owner
    Large MSA MO/MB, Medium MSA MO/MB               49                   50
    Small MSA & Non-MSA MO/MB, All other TEAs       51                   52
  Nonowner
    Large MSA MO/MB, Medium MSA MO/MB               53                   54
    Small MSA & Non-MSA MO/MB, All other TEAs       55                   56
Domain 5 (Native Hawaiian or Pacific Islander)
  Owner                                             57
  Nonowner                                          58
Domain 6 (Non-Hispanic Asian)
  Owner                                             59
  Nonowner                                          60
American Indian or Alaska Native
 Domain 1 (On Reservation)
  Owner                                             61
  Nonowner                                          62
 Domain 2 (Off Reservation)
  Owner                                             63
  Nonowner                                          64

*For Census 2000, persons can self-identify with more than one race group. For post-stratification purposes, persons are included in a single Race/Hispanic Origin Domain. This classification does not change a person's actual response. Further, all official tabulations are based on actual responses to the census.

Table 7-3. Census 2000 A.C.E. Age/Sex Groups

              Male    Female
Under 18          1
18 to 29      2       3
30 to 49      4       5
50+           6       7

Post-stratification Variables

This section gives a detailed description of the post-stratification variables including the handling of multiple responses to the race question. A.C.E. post-stratification used the following variables:
* race/Hispanic origin - seven categories
* age/sex - seven categories
* tenure - two categories
* Metropolitan Statistical Area (MSA) by Type of Enumeration (TEA) - four categories
* return rate - two categories
* region - four categories

The seven race/Hispanic origin domains were:
* American Indian or Alaska Native on Reservations
* Off-Reservation American Indian or Alaska Native
* Hispanic
* Non-Hispanic Black
* Native Hawaiian or Pacific Islander
* Non-Hispanic Asian
* Non-Hispanic White or Some other race

Inclusion in a race/Hispanic origin domain is complicated, as it depends on several variables and whether there are multiple race responses. In addition, inclusion in a race/Hispanic origin domain does not change a person's race/Hispanic origin response. All Census 2000 tabulations are based on the actual responses. For example, a person who responded as American Indian on a reservation and Black was placed in the first race/Hispanic origin category (Domain 1) for post-stratification purposes, but was tabulated in the census as American Indian/Black.

The seven age/sex categories were:
1. Under 18
2. 18-29 male
3. 18-29 female
4. 30-49 male
5. 30-49 female
6. 50+ male
7. 50+ female

The two tenure categories were:
1. Owner
2. Nonowner

The four MSA/TEA categories were:
1. Large MSA Mailout/Mailback (MO/MB)
2. Medium MSA MO/MB
3. Small MSA or Non-MSA MO/MB
4. All other TEAs

MSA/CMSA FIPS codes, as defined by the Office of Management and Budget, were used for post-stratification. For simplification, MSA/CMSA will herein be referred to as MSA. Large MSA consists of the ten largest MSAs based on unadjusted, Census 2000 total population counts including the population in Group Quarters. Medium MSAs are those (besides the largest 10) that have at least 500,000 total population. Small MSAs are those with a total population size less than 500,000. For post-stratification purposes, MO/MB areas were contrasted with the non-MO/MB areas.

The two return rate categories were:
1. High
2. Low

Return rate is a tract-level variable measuring the proportion of occupied housing units in the mailback universe that returned a census questionnaire. Low (high) return rate tracts are those tracts whose return rate is less than or equal to (greater than) the 25th percentile return rate. Separate 25th percentile cut-off values were formed for the six applicable race/Hispanic origin by tenure groups.
Persons in List/Enumerate, Rural Update/Enumerate, and Urban Update/Enumerate TEAs were automatically placed in the High category.

The four region categories were:
1. Northeast
2. Midwest
3. South
4. West

Pre-Collapsing

Pre-collapsing was done prior to data collection and knowledge of the exact sample size in each post-stratum. All race/Hispanic origin, age/sex, and tenure categories for the U.S. were initially maintained. The research for the determination of the important post-stratification variables provided information on the expected sample size in each category which was then used to define a collapsing hierarchy. The pre-collapsing plan for the region, MSA/TEA and return rate variables was as follows:
* Non-Hispanic White or Some other race Owners: No collapsing.
* Non-Hispanic White or Some other race Non-owners: Region was eliminated.
* Non-Hispanic Black: Region was eliminated. In addition there was partial collapsing of the MSA/TEA variable within return rate and tenure categories.
* Hispanic: Region was eliminated. In addition there was partial collapsing of the MSA/TEA variable within return rate and tenure categories.
* Native Hawaiian or Pacific Islander: The region, return rate and MSA/TEA variables were eliminated. Only tenure and age/sex were retained.
* Non-Hispanic Asian: The region, return rate and MSA/TEA variables were eliminated. Only tenure and age/sex were retained.
* American Indian or Alaska Native on Reservations: The region, return rate and MSA/TEA variables were eliminated. Only tenure and age/sex were retained.
* Off-Reservation American Indian or Alaska Native: The region, return rate and MSA/TEA variables were eliminated. Only tenure and age/sex were retained.
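The pre-collapsing plan can be checked by simple counting. The Python sketch below tallies how many post-stratum groups each domain contributes. It assumes, as Table 7-2 indicates, that the partial MSA/TEA collapsing for the Non-Hispanic Black and Hispanic domains leaves two MSA/TEA groups. The dictionary layout and variable names are illustrative only.

```python
# Levels retained after pre-collapsing, per (domain, tenure):
# (region levels, MSA/TEA levels, return-rate levels).
RETAINED = {
    ("Domain 7", "Owner"): (4, 4, 2),     # no collapsing
    ("Domain 7", "Nonowner"): (1, 4, 2),  # region eliminated
    ("Domain 4", "Owner"): (1, 2, 2),     # region eliminated, MSA/TEA partly collapsed
    ("Domain 4", "Nonowner"): (1, 2, 2),
    ("Domain 3", "Owner"): (1, 2, 2),
    ("Domain 3", "Nonowner"): (1, 2, 2),
    ("Domain 5", "Owner"): (1, 1, 1),     # only tenure (and age/sex) retained
    ("Domain 5", "Nonowner"): (1, 1, 1),
    ("Domain 6", "Owner"): (1, 1, 1),
    ("Domain 6", "Nonowner"): (1, 1, 1),
    ("Domain 1", "Owner"): (1, 1, 1),
    ("Domain 1", "Nonowner"): (1, 1, 1),
    ("Domain 2", "Owner"): (1, 1, 1),
    ("Domain 2", "Nonowner"): (1, 1, 1),
}
AGE_SEX_GROUPS = 7

groups_by_domain = {}
for (domain, _tenure), (region, msa_tea, return_rate) in RETAINED.items():
    cells = region * msa_tea * return_rate
    groups_by_domain[domain] = groups_by_domain.get(domain, 0) + cells

total_groups = sum(groups_by_domain.values())
max_post_strata = total_groups * AGE_SEX_GROUPS
print(groups_by_domain, total_groups, max_post_strata)
```

Under these assumptions the tally reproduces the 64 post-stratum groups and the maximum of 448 post-strata quoted in the text.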
Post-Collapsing

A.C.E. post-stratification included a plan to collapse post-strata that contained less than 100 (unweighted) P-sample persons, called post-collapsing, considering such a post-stratum too small to produce reliable estimates. If a collapsed post-stratum was still too small, it could have been further collapsed. The collapsing procedure was hierarchical and required a pre-defined collapsing order. Given the pre-collapsing plan that yielded 448 post-strata, not much post-collapsing was anticipated, but an extensive post-collapsing strategy was designed for completeness and to satisfy the requirement of pre-specification.

Note that collapsing does not necessarily imply elimination of a variable. Collapsing can refer to a reduction in the number of categories for a variable. The following general outline describes the post-collapsing hierarchy that was planned:
* If any of the 448 post-strata are too small, collapse age/sex first. This means that within any of the 64 U.S. post-stratum groups, if at least one of the seven age/sex categories defined in Table 7-3 has less than 100 P-sample persons, reduce age/sex to the following three categories: Under 18, 18+ male, and 18+ female.
* If some post-strata are still too small and require collapsing, collapse region next, if applicable. This collapsing applies only to the Non-Hispanic White or Some other race domain since the variable region is only included in its post-stratification definition. In this case, all levels of region (Northeast, Midwest, South, West) are combined to eliminate the variable.
* Next, collapse the four-level MSA/TEA variable into the following two groups:
** Large and medium MSA MO/MB
** Small MSA and non-MSA MO/MB and all other TEAs
* If further collapsing is necessary, return rate is the next variable to collapse. High and Low return rate categories are combined to eliminate the variable.
* Next collapse the variable MSA/TEA. If necessary, the two groups defined above would be combined together to eliminate the variable MSA/TEA completely.
* The next variable to collapse is tenure. Owner and nonowner categories are combined to eliminate the variable entirely, if necessary.
* If collapsing is still needed, the three remaining age/sex post-strata are combined to eliminate the age/sex variable completely.
* In the event that there are less than 100 P-sample persons in a race/Hispanic origin domain, combine all persons in that domain with Domain 7, which includes non-Hispanic White and Some other race.

In practice, only the first step of collapsing was necessary. Eight of the 64 post-stratum groups had their 7 age/sex post-strata collapsed to 3 age/sex groups, resulting in 32 fewer post-strata. Thus, there were 448 - 32 = 416 post-strata.

Race and Hispanic Origin Classifications

The Census 2000 questionnaire has 15 possible race responses. The 15 responses are collapsed into six major race groups as shown below. Races that are included in the major groups are shown in parentheses. Persons self-identifying with a single race essentially place themselves into one of these six categories.
* White
* Black (Black, African American, Negro)
* American Indian or Alaska Native
* Asian (Asian Indian, Chinese, Filipino, Japanese, Korean, Vietnamese, Other Asian)
* Native Hawaiian or Pacific Islander (Native Hawaiian, Guamanian or Chamorro, Samoan, Other Pacific Islander)
* Some other race (There was a box on the questionnaire labeled Some other race - Print race with a line to enter any race the respondent desired.)

For the first time in census history, persons were able to respond to more than one race category. Allowing persons to self-identify with multiple races results in many more than six race groups. In fact, after collapsing race to the six major groups, there are $2^6 - 1 = 63$ possible race combinations. It is necessary to subtract the 1 in this equation since each individual is assumed to have a race.

The race variable defined above is often cross-classified with the Hispanic origin variable to define post-strata. The Hispanic origin variable consists of two responses, No and Yes. Categories that are included in the Yes response are shown in parentheses.
1. No, not Spanish/Hispanic/Latino
2. Yes (Mexican, Mexican American, Chicano, Puerto Rican, Cuban, Other Spanish/Hispanic/Latino)

Combining the race and Hispanic origin variables yields 63 x 2 = 126 possible race/Hispanic origin groups. It is important to note that in a survey the size of A.C.E., no post-stratification plan of interest can support 126 race/Hispanic origin groups. Consequently, each of the 126 race/Hispanic origin response possibilities was assigned to one of seven race/Hispanic origin domains. The seven race/Hispanic origin domains are defined as follows:
1. American Indian or Alaska Native on Reservations
2. Off-Reservation American Indian or Alaska Native
3. Hispanic
4. Non-Hispanic Black
5. Native Hawaiian or Pacific Islander
6. Non-Hispanic Asian
7. Non-Hispanic White or Some other race

Note that missing race and Hispanic origin data are imputed. The rules used to classify the 126 race and Hispanic origin combinations into one of the seven race/Hispanic origin domains are now presented. Many of the decisions on how multiple race persons were classified are based on cultural, linguistic, and sociological factors, which are known to affect coverage and are not necessarily data-driven.

A hierarchy was used to assign persons to a race/Hispanic origin domain. The race/Hispanic origin designation occurs in the following order: American Indian or Alaska Native on Reservations, Off-Reservation American Indian or Alaska Native, Hispanic, Non-Hispanic Black, Native Hawaiian or Pacific Islander, Non-Hispanic Asian, and Non-Hispanic White or Some other race. This collapsing was only used for the post-stratification; all census data were tabulated in accordance with the race and Hispanic origin categories selected by census respondents.

For the following tables, Indian Country (IC) is a block-level variable that indicates whether a block is (wholly or partly) inside an American Indian reservation/trust land, Oklahoma Tribal Statistical Area (OTSA), Tribal Designated Statistical Area (TDSA), or Alaska Native Village Statistical Area (ANVSA).

Tables 7-4 and 7-5 display the assignment of race/Hispanic origin domains. Table 7-4 applies to Hispanic persons, while Table 7-5 applies to non-Hispanic persons.
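The assignment hierarchy can be sketched as a small Python function. This is an informal reading of Tables 7-4 and 7-5 and of the domain descriptions in this section, not the Census Bureau's production logic. The function name, the race labels, and the location codes are hypothetical. One ordering is an assumption: American Indian or Alaska Native responses inside Indian Country are resolved before the Hawaii exception.

```python
AIAN, BLACK, NHPI, ASIAN, WHITE, SOR = "AIAN", "Black", "NHPI", "Asian", "White", "SOR"


def assign_domain(races, hispanic, location, in_hawaii=False):
    """Return the race/Hispanic origin domain (1-7) for one person.

    races:    set of major race groups marked (one or more of the six above)
    hispanic: True if the person reported Hispanic origin
    location: "reservation", "ic_off_reservation", or "not_ic"
    """
    races = set(races)
    # Domains 1 and 2: any AIAN response inside Indian Country.
    if AIAN in races and location == "reservation":
        return 1
    if AIAN in races and location == "ic_off_reservation":
        return 2
    # Hawaii exception (assumed to apply after the Indian Country rules).
    if NHPI in races and in_hawaii:
        return 5
    if hispanic:
        return 3
    # Remaining cases are non-Hispanic and not AIAN-in-Indian-Country.
    if races == {AIAN}:
        return 2
    if AIAN in races and len(races) == 2:
        races = races - {AIAN}  # classified by the other race
    elif len(races) >= 3:
        return 7
    if BLACK in races:
        return 4
    if races == {NHPI} or races == {NHPI, ASIAN}:
        return 5
    if races == {ASIAN}:
        return 6
    return 7


print(assign_domain({AIAN, BLACK}, hispanic=False, location="not_ic"))
```

The checks run in the same order as the hierarchy in the text, so the first rule that applies fixes the domain.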
The first six rows of Tables 7-4 and 7-5 correspond to a single race response. The remaining portion of the tables address the assignment of multiple race responses to a single race/Hispanic origin domain. Although a person may be associated with multiple race responses, each person is included in only one of the seven race/Hispanic origin domains. All persons with a common number are assigned to the same race/Hispanic origin domain. The number for each race/Hispanic origin domain was assigned as follows:

Domain 1 (Includes American Indian or Alaska Native on Reservations). This domain includes any person living on a reservation marking American Indian or Alaska Native either as their single race or as one of many races, regardless of their Hispanic origin.

Domain 2 (Includes Off-Reservation American Indian or Alaska Native). This domain includes any person living in Indian Country, but not on a reservation, who marks American Indian or Alaska Native either as a single race or as one of many races, regardless of their Hispanic origin. This domain also includes any Non-Hispanic person not living in Indian Country who marks American Indian or Alaska Native as a single race.

Domain 3 (Includes Hispanic). This domain includes all Hispanic persons who are not included in Domains 1 or 2. All Hispanic persons (excluding American Indian or Alaska Native in Indian Country) are included in Domain 3. The only exception to this rule occurs when a Hispanic person lives in the state of Hawaii and classifies himself or herself as Native Hawaiian or Pacific Islander, regardless of whether he or she identifies with a single or multiple race. All Hispanic persons satisfying this condition are re-classified into Domain 5.

Domain 4 (Includes Non-Hispanic Black). This domain includes any non-Hispanic person who marks Black as their only race. It also includes the combination of Black and American Indian or Alaska Native not in Indian Country. In addition, people who mark Black and another single race group (Native Hawaiian or Pacific Islander, Asian, White, or Some other race) are included in Domain 4. The only exception to this rule occurs when a Non-Hispanic Black person lives in the state of Hawaii and classifies himself or herself as Native Hawaiian or Pacific Islander. All Non-Hispanic Black persons satisfying this condition are reclassified into Domain 5.

Domain 5 (Includes Native Hawaiian or Pacific Islander). This domain includes any Non-Hispanic person marking the single race Native Hawaiian or Pacific Islander. For Non-Hispanic persons, it also includes the race combination of Native Hawaiian or Pacific Islander and American Indian or Alaska Native not in Indian Country.
Also included is the race combination of Native Hawaiian or Pacific Islander with Asian for Non-Hispanic persons. All persons living in the state of Hawaii who classify themselves as Native Hawaiian or Pacific Islander, regardless of their Hispanic origin and whether they identify with a single or multiple race, are also included in Domain 5.

Domain 6 (Includes Non-Hispanic Asian). This domain includes any non-Hispanic person marking Asian as their single race. If a person self-identifies with Asian and American Indian or Alaska Native not in Indian Country, they are included in Domain 6.

Domain 7 (Includes Non-Hispanic White or Some other race). Non-Hispanic White or Non-Hispanic Some other race persons are included in Domain 7. Non-Hispanic persons who self-identify with American Indian or Alaska Native not in Indian Country and are White or Some other race are classified into Domain 7. If a Native Hawaiian or Pacific Islander response is combined with a White or Some other race response, they also are included in Domain 7. A person who self-identifies with Asian and White or Asian and Some other race is also included in this domain. Finally, all Non-Hispanic persons who self-identify with three or more races (excluding American Indian or Alaska Native in Indian Country) are included in Domain 7. The only exception to this rule occurs when a Non-Hispanic White or Non-Hispanic Some other race person lives in Hawaii and classifies themselves as Native Hawaiian or Pacific Islander, regardless of whether they identify with other races. Persons who satisfy these criteria are re-classified into Domain 5.

Table 7-4. Census 2000 A.C.E. Race/Origin Post-stratification Domains for Hispanic

                                               Not in IC    Indian country (IC)
                                                            Not on reservation    On reservation
Single race:
  American Indian or Alaska Native             3            2                     1
  Black                                        3            3                     3
  Native Hawaiian or Pacific Islander          *3           3                     3
  Asian                                        3            3                     3
  White                                        3            3                     3
  Some other race                              3            3                     3
American Indian or Alaska Native and:
  Black                                        3            2                     1
  Native Hawaiian or Pacific Islander          *3           2                     1
  Asian                                        3            2                     1
  White                                        3            2                     1
  Some other race                              3            2                     1
Black and:
  Native Hawaiian or Pacific Islander          *3           3                     3
  Asian                                        3            3                     3
  White                                        3            3                     3
  Some other race                              3            3                     3
Native Hawaiian or Pacific Islander and:
  Asian                                        *3           3                     3
  White                                        *3           3                     3
  Some other race                              *3           3                     3
Asian and:
  White                                        3            3                     3
  Some other race                              3            3                     3
American Indian or Alaska Native and:
  Two or More Races                            *3           2                     1
All Else**                                     *3           3                     3

*All persons living in the state of Hawaii who classify themselves as Native Hawaiian or Pacific Islander, regardless of their Hispanic origin and whether they identify with a single or multiple race, are included in Domain 5, which includes Native Hawaiian or Pacific Islander.
**All Else encompasses all remaining combinations that exclude American Indian or Alaska Native.

Table 7-5. Census 2000 A.C.E. Race/Origin Post-stratification Domains for Non-Hispanic

                                               Not in IC    Indian country (IC)
                                                            Not on reservation    On reservation
Single race:
  American Indian or Alaska Native             2            2                     1
  Black                                        4            4                     4
  Native Hawaiian or Pacific Islander          5            5                     5
  Asian                                        6            6                     6
  White                                        7            7                     7
  Some other race                              7            7                     7
American Indian or Alaska Native and:
  Black                                        4            2                     1
  Native Hawaiian or Pacific Islander          5            2                     1
  Asian                                        6            2                     1
  White                                        7            2                     1
  Some other race                              7            2                     1
Black and:
  Native Hawaiian or Pacific Islander          *4           4                     4
  Asian                                        4            4                     4
  White                                        4            4                     4
  Some other race                              4            4                     4
Native Hawaiian or Pacific Islander and:
  Asian                                        5            5                     5
  White                                        *7           7                     7
  Some other race                              *7           7                     7
Asian and:
  White                                        7            7                     7
  Some other race                              7            7                     7
American Indian or Alaska Native and:
  Two or More Races                            *7           2                     1
All Else**                                     *7           7                     7

*All persons living in the state of Hawaii who classify themselves as Native Hawaiian or Pacific Islander, regardless of their Hispanic origin and whether they identify with a single or multiple race, are included in Domain 5, which includes Native Hawaiian or Pacific Islander.
**All Else encompasses all remaining combinations which exclude American Indian or Alaska Native.

VARIANCE ESTIMATION

The A.C.E. sample was considered a three-phase sample: the initial listing sample was the first phase; A.C.E. reduction and small block cluster subsampling was the second phase; and Targeted Extended Search (TES) was the third phase. Multiphase sampling differs from multistage in the following way. In a multistage design, the information needed to draw all stages of the sample is known before the sampling begins; in a multiphase design, the information needed to draw any phase of the sample is not available until the previous phase is completed. Because of the multiphase nature of the design (housing counts not available until after the first-phase listing), a new variance estimator needed to be developed.
Full details are given in Starsinic and Kim (2001).

Our goal is to obtain a variance estimator for the Dual System Estimator (DSE), of the form:
$$DSE = DD \left( \frac{CE}{N_e} \right) \left( \frac{N_n + N_i}{M_n + \left( \frac{M_o}{N_o} \right) N_i} \right) \qquad (1)$$

where:
DD = number of census data-defined persons
CE = estimated number of A.C.E. E-sample correct enumerations
$N_e$ = estimated number of A.C.E. E-sample persons
$N_n$ = estimated number of A.C.E. P-sample nonmovers
$N_i$ = estimated number of A.C.E. P-sample inmovers
$N_o$ = estimated number of A.C.E. P-sample outmovers
$M_n$ = estimated number of A.C.E. P-sample nonmover matches
$M_o$ = estimated number of A.C.E. P-sample outmover matches

The DSE is computed separately for each post-stratum denoted by h. The national corrected population estimate is computed as:
$$T_{US} = \sum_h DSE_h \qquad (2)$$

There is no closed-form solution for the variance estimator, and the Taylor linearization variance estimator is very complex. That leaves replication methodology as the only practical variance estimator. Specifically, a stratified jackknife estimator was the type of replication method chosen for the implementation.

A jackknife estimator is calculated from a set of replicates where the number of replicates is equal to the number of observations (clusters in this case) in the sample. Each replicate represents what the DSE would have been had each particular cluster not been part of the sample. The overall variance is calculated by summing the squares of the differences between the replicate DSE and the whole-sample DSE.

The most important challenge for the Census 2000 A.C.E. variance estimation was the precise form for calculating the contribution of replicate DSEs to the variance estimator; in particular, new weights had to be calculated for replicates to represent the effect of removing the cluster whose replicate was being calculated. No previous results were directly applicable to the DSE, but a methodology was developed based on the work of Rao and Shao (1992).

The remaining part of this section describes the precise formulas in detail. They require somewhat complex notation and mathematical steps.

Detailed Methodology

A general estimator of a total is:
$$T_y = \sum_i w_i y_i \qquad (3)$$

The estimator for the j-th replicate is

$$T_y^{(j)} = \sum_i w_i^{(j)} y_i \qquad (4)$$

where $y_i$ is the characteristic of interest, and $w_i^{(j)}$ is the replicate weight for the i-th unit, which differs from the original weight in a prespecified subset of the observations. With these replicate estimators, a variance estimator can be constructed:
$$Var(T_y) = \sum_j c_j \left( T_y^{(j)} - T_y \right)^2 \qquad (5)$$

Before continuing, we must set down some specific notation. Let $w_i$ be the first phase sampling weight, and let $y_i$ be the cluster-level total of any of the seven estimated components of the DSE (CE, $N_n$, etc.). Let $A$ and $A_2$ indicate the first and second phase samples, respectively. Let $x_{ig} = 1$ if unit i is in group (second phase stratum) g and zero otherwise. Let $n_h$ be the number of units selected in first-phase stratum h. Let $n_g$ be the number of units in stratum h that are also in group g, and let $r_g$ be the number of the $n_g$ units selected in the second phase. In all of the following equations, j will represent one cluster that is being dropped to calculate its associated replicate estimate $T^{(j)}$; k is one cluster other than the one being dropped.

For two-phase stratified sampling, there are two different point estimators, the Double Expansion Estimator (DEE)
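$$DEE = \sum_g \sum_{i \in A_2} \frac{n_g}{r_g}\, w_i x_{ig} y_i \qquad (6)$$

and the Reweighted Expansion Estimator (REE)

$$REE = \sum_g \sum_{i \in A_2} \left( \frac{\sum_{k \in A} w_k x_{kg}}{\sum_{k \in A_2} w_k x_{kg}} \right) w_i x_{ig} y_i \qquad (7)$$

There is an established result by Rao and Shao (1992) which gives a replicate variance estimator for the REE under two-phase stratified sampling. Unfortunately, all the individual components of the DSE, such as $N_e$, the number of E-sample people, are DEEs. Taking a closer look at the DEE, however, suggested a procedure that could be applied. With

$$n_g = \sum_{i \in A} x_{ig}, \qquad r_g = \sum_{i \in A_2} x_{ig},$$

the DEE can be written as

$$DEE = \sum_g \sum_{i \in A_2} \frac{n_g}{r_g}\, w_i x_{ig} y_i = \sum_g \sum_{i \in A_2} \left( \frac{\sum_{k \in A} x_{kg}}{\sum_{k \in A_2} x_{kg}} \right) w_i x_{ig} y_i = \sum_g \sum_{i \in A_2} \left( \frac{\sum_{k \in A} w_k x_{kg} w_k^{-1}}{\sum_{k \in A_2} w_k x_{kg} w_k^{-1}} \right) w_i x_{ig} y_i \qquad (8)$$

The DEE has just been rewritten in a form that is quite similar to the REE. This suggests the following generalization:

$$\hat{T}_{y2} = \sum_{i \in A_2} \omega_i y_i, \quad \text{where} \quad \omega_i = \sum_g \left( \frac{\sum_{k \in A} w_k x_{kg} q_k}{\sum_{k \in A_2} w_k x_{kg} q_k} \right) w_i x_{ig} \qquad (9)$$

and where $q_k = 1$ for the REE and $q_k = w_k^{-1}$ for the DEE. Replicates are then naturally written as: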
$$\hat{T}_{y2}^{(j)} = \sum_{i \in A_2} \omega_i^{(j)} y_i, \quad \text{where} \quad \omega_i^{(j)} = \sum_g \left( \frac{\sum_{k \in A} w_k^{(j)} x_{kg} q_k}{\sum_{k \in A_2} w_k^{(j)} x_{kg} q_k} \right) w_i^{(j)} x_{ig} \qquad (10)$$

When $q_k = 1$ (i.e., the REE case), the replicate variance estimator of this generalized estimator, based on equation (5), is the same as the REE replicate variance estimator of Rao and Shao (1992).

Application To a Three-Phase Dual System Estimator

Within any of the seven components of the DSE that are subject to sampling error (CE, $N_e$, $N_n$, $N_o$, $N_i$, $M_n$, and $M_o$), the cluster sums ($y_i$) can be broken down into two components: the total prior to any adjustments made by TES ($u_i$), and the additional total from the TES sample ($v_i$). This second piece can be further subdivided into TES totals from clusters sampled with certainty, and TES totals from clusters sampled systematically. The estimator (a DEE) of one of the components is

$$\hat{T}_{y3} = \sum_{i \in A_2} \omega_i u_i + \sum_{k=1}^{2} \sum_{i \in A_2} \omega_i t_k s_{ik} a_i v_i \qquad (11)$$

where $s_{ik}$ is the third phase stratum indicator ($s_{i1} = 1$ if the cluster is selected with certainty, 0 otherwise; $s_{i2} = 1 - s_{i1}$, an indicator that the cluster is eligible to be selected systematically), $a_i$ is the third phase sample indicator ($a_i = 1$ if the cluster is in $A_3$, 0 otherwise), and $t_k$, the TES conditional weight, is equal to

$$t_k = \frac{\sum_{i \in A_2} s_{ik}}{\sum_{i \in A_2} s_{ik} a_i} = \frac{\sum_{i \in A_2} s_{ik}}{\sum_{i \in A_3} s_{ik}} = \frac{\text{number of clusters selected in phase 2}}{\text{number of clusters selected in phase 3}} \qquad (12)$$

For $s_{i1}$, the certainty stratum, all clusters within it have $a_i = 1$, so $t_k = 1$ for all clusters in the stratum. To create the replicate estimator, simply apply what was learned above in equations (8) and (10).
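As an illustration of equations (8) through (10), the following minimal sketch computes the generalized second-phase weights for a toy sample and checks that the choice q = 1/w reproduces the DEE factor n_g / r_g. The function name `generalized_weights`, the data layout, and the toy numbers are assumptions made for this example; they are not taken from the A.C.E. production system.

```python
def generalized_weights(w, x, in_phase2, q):
    """Generalized two-phase weights in the form of equation (9).

    w         -- first-phase weights, one per first-phase unit
    x         -- x[k][g] = 1 if unit k is in second-phase stratum g, else 0
    in_phase2 -- True if the unit was retained in the second phase
    q         -- q[k] = 1 gives the REE form, q[k] = 1 / w[k] gives the DEE form
    Assumes every group has at least one second-phase unit.
    """
    n_units, n_groups = len(w), len(x[0])
    ratio = []
    for g in range(n_groups):
        num = sum(w[k] * x[k][g] * q[k] for k in range(n_units))
        den = sum(w[k] * x[k][g] * q[k] for k in range(n_units) if in_phase2[k])
        ratio.append(num / den)
    return {
        i: sum(ratio[g] * w[i] * x[i][g] for g in range(n_groups))
        for i in range(n_units)
        if in_phase2[i]
    }


# Hypothetical toy data: five first-phase units in two groups.
w = [2.0, 2.0, 4.0, 4.0, 4.0]
x = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1]]
in_phase2 = [True, False, True, True, False]

dee = generalized_weights(w, x, in_phase2, q=[1.0 / wk for wk in w])
ree = generalized_weights(w, x, in_phase2, q=[1.0] * len(w))
```

In this toy case group 0 has n_g = 3 and r_g = 2, so the DEE weights are 1.5 times the first-phase weights, while the REE weights instead reproduce the first-phase weighted total of each group.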
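$$\hat{T}_{y3}^{(j)} = \sum_{i \in A_2} \omega_i^{(j)} u_i + \sum_{k=1}^{2} \sum_{i \in A_2} \omega_i^{(j)} t_k^{(j)} s_{ik} a_i v_i \qquad (13)$$

$$= \sum_{i \in A_2} \omega_i^{(j)} u_i + \sum_{i \in A_2} \omega_i^{(j)} t_1^{(j)} s_{i1} a_i v_i + \sum_{i \in A_2} \omega_i^{(j)} t_2^{(j)} s_{i2} a_i v_i$$

where

$$t_1^{(j)} = 1, \qquad t_2^{(j)} = \frac{\sum_{i \in A_2} \omega_i^{(j)} s_{i2}\, \omega_i^{-1}}{\sum_{i \in A_2} \omega_i^{(j)} s_{i2} a_i\, \omega_i^{-1}}$$

Implementation of Variance Estimation for the A.C.E.

The first step in implementing this variance estimation methodology is calculating the replicate weights. To this point, the method of replication used to arrive at the variance is immaterial, but we will now state that the jackknife will be used. Let the replicate weights after the first stage of sampling be the standard jackknife replicate weights

$$w_i^{(j)} = \begin{cases} 0 & \text{if } i = j \\ \dfrac{n_h}{n_h - 1}\, w_{hi} & \text{if } i \text{ and } j \text{ are in the same first phase stratum} \\ w_{hi} & \text{otherwise} \end{cases} \qquad (14)$$

Then, the final weights are obtained by applying equation (10). Note that this is an unusual form of the jackknife. Normally, the jackknife has as many replicates as observations. Here, there are 11,303 clusters remaining after the second phase of the sample, but the number of replicates is equal to the first phase sample size of 29,136 clusters.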
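The replicate weights of equation (14) and the variance form of equation (5) can be sketched for a simple weighted total as follows. This is a minimal illustration, not the A.C.E. production code: the function names and toy data are hypothetical, and the factor c_j = (n_h - 1) / n_h is the usual stratified delete-one jackknife choice, assumed here because equation (5) leaves c_j unspecified. Every stratum is assumed to contain at least two clusters.

```python
def jackknife_replicate_weights(weights, strata, j):
    """Replicate weights for dropping cluster j, following equation (14)."""
    n_h = sum(1 for s in strata if s == strata[j])
    replicate = []
    for i, (w, s) in enumerate(zip(weights, strata)):
        if i == j:
            replicate.append(0.0)                  # the dropped cluster
        elif s == strata[j]:
            replicate.append(w * n_h / (n_h - 1))  # same first-phase stratum
        else:
            replicate.append(w)                    # other strata unchanged
    return replicate


def jackknife_variance(weights, strata, y):
    """Variance of the weighted total sum(w_i * y_i) in the form of equation (5)."""
    full = sum(w * v for w, v in zip(weights, y))
    variance = 0.0
    for j in range(len(weights)):
        n_h = sum(1 for s in strata if s == strata[j])
        rep_w = jackknife_replicate_weights(weights, strata, j)
        rep_total = sum(w * v for w, v in zip(rep_w, y))
        # Assumed c_j: the standard stratified jackknife factor.
        variance += (n_h - 1) / n_h * (rep_total - full) ** 2
    return variance


# Hypothetical toy data: three clusters in stratum "a", two in stratum "b".
weights = [10.0, 10.0, 10.0, 5.0, 5.0]
strata = ["a", "a", "a", "b", "b"]
y = [1.0, 2.0, 3.0, 4.0, 6.0]

rep0 = jackknife_replicate_weights(weights, strata, 0)
var_total = jackknife_variance(weights, strata, y)
```

For a plain weighted total this reproduces the familiar stratified with-replacement variance estimate; in the A.C.E. the same replicate weights are carried through equations (10) and (13) before the replicate DSEs are formed.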
The clusters sampled out in the second phase obviously do not contribute to the variance due to the second and third phases, but they must be included to accurately account for the first phase of sampling. Deleting a cluster that was sampled out changes the weights of the other clusters that were in the same first phase sampling stratum.

The second step of the implementation is to adjust the imputation of certain probabilities to account for the replication. This is a component of the variance that can be accounted for by including the effect of the replicate weights in the imputation. For some persons, their match, residence, or correct enumeration status remains unresolved even after follow-up operations. In these cases, a probability for each unresolved status is imputed using an imputation cell technique, with each unresolved case in an imputation cell getting the same imputed probability. The general form for the replicated imputation of the probability for an unresolved person in imputation cell k is:
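$$Pr_k^{*(j)} = \frac{\sum_{\text{resolved } p \in k} w_p^{*(j)}\, t_p^{*(j)}\, Pr_p}{\sum_{\text{resolved } p \in k} w_p^{*(j)}\, t_p^{*(j)}} \qquad (15)$$

where the summation is over all resolved persons in imputation cell k, and:

$w_p^{*(j)}$ = person-level weight for replicate j, incorporating all sampling operations except TES, and not including the noninterview adjustment

$t_p^{*(j)}$ = the conditional TES weight for replicate j (the inverse of the probability of selection in the TES sample) if the person is a TES person; 1 if the person is NOT a TES person

$Pr_p$ = 1 if a person is a match / resident / correct enumeration; 0 if a person is NOT a match / resident / correct enumeration

To complete the estimation of the variances, the 29,136 replicate dual system estimates were computed for each of the 448 post-strata:

$$\widehat{DSE}_h^{(j)} = (C - II) \left( \frac{CE^{(j)}}{N_e^{(j)}} \right) \left( \frac{N_n^{(j)} + N_i^{(j)}}{M_n^{(j)} + \left( \dfrac{M_o^{(j)}}{N_o^{(j)}} \right) N_i^{(j)}} \right) \qquad (16)$$

Equation (13) was used for the separate computation of each of the seven replicated terms of the DSE: $CE^{(j)}$, $N_e^{(j)}$, $N_n^{(j)}$, $N_i^{(j)}$, $N_o^{(j)}$, $M_n^{(j)}$, and $M_o^{(j)}$. The variance estimates for post-stratum h used formula (5):

$$\widehat{Var}(\widehat{DSE}_h) = \sum_j \frac{n_{1,i} - 1}{n_{1,i}} \left( \widehat{DSE}_h^{(j)} - \widehat{DSE}_h \right)^2 \qquad (17)$$

Finally, the variance of the national adjusted population estimate is: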
$$\widehat{Var}(\hat{T}_{US}) = \sum_{\text{post-stratum } h} \; \sum_{\text{post-stratum } h'} \widehat{Cov}(\widehat{DSE}_h, \widehat{DSE}_{h'}),$$

where $\widehat{Cov}(\widehat{DSE}_h, \widehat{DSE}_h) = \widehat{Var}(\widehat{DSE}_h)$, and

$$\widehat{Cov}(\widehat{DSE}_h, \widehat{DSE}_{h'}) = \sum_j \frac{n_{1,i} - 1}{n_{1,i}} \left( \widehat{DSE}_h^{(j)} - \widehat{DSE}_h \right) \left( \widehat{DSE}_{h'}^{(j)} - \widehat{DSE}_{h'} \right) \qquad (18)$$

Covariances exist between post-strata mostly because of correlations between members of the same household being in different post-strata but having the same probability of being included in the sample. For instance, within a given race/Hispanic origin/tenure/region group there exists some covariance among males 30-49, females 30-49 and children 0-17, because such persons are likely to live in the same household, and hence, show very similar census and A.C.E. inclusion probabilities.

RESULTS

The percent net undercount (UC) is the estimated net undercount (or net overcount) divided by the dual system estimate for a post-stratum expressed as a percentage. A positive number implies undercoverage, while a negative number implies overcoverage. The percent net undercount for Census 2000 shown in this document is strictly for the household population and excludes group quarters persons.

$$UC = \left( \frac{DSE - C}{DSE} \right) \times 100$$

Table 7-6 presents the estimated percent net undercount for each of the 64 post-stratum groups. Table 7-7 presents the standard error of each of these estimates. Many more results are available in Davis (2001).

Table 7-6. Census 2000 A.C.E. 64 Post-Stratum Groups - Percent Net Undercount

Domain 7 (non-Hispanic White or Some other race), Owner, by region:

MSA/TEA                          High return rate              Low return rate
                                 NE     MW     S      W        NE     MW     S      W
Large MSA MO/MB                  0.81   0.01   0.36  -0.38    -3.62  -2.61   2.19   1.14
Medium MSA MO/MB                 0.30  -0.12   0.46  -0.28    -4.39  -0.33   0.66   1.81
Small MSA & Non-MSA MO/MB       -0.25   0.14   0.44   0.30     2.29   2.61   2.09   2.71
All other TEAs                   1.84  -1.11   1.34   0.85     0.56  -0.16   0.15   1.59

Remaining groups with a return-rate split:

Race/Hispanic origin domain number*   Tenure     MSA/TEA                                           High return rate   Low return rate
Domain 7                              Nonowner   Large MSA MO/MB                                   1.82               1.02
                                                 Medium MSA MO/MB                                  0.61               2.83
                                                 Small MSA & Non-MSA MO/MB                         2.45               3.61
                                                 All other TEAs                                    1.64               4.08
Domain 4 (Non-Hispanic Black)         Owner      Large and Medium MSA MO/MB                        1.63              -1.31
                                                 Small MSA & Non-MSA MO/MB and All other TEAs      0.07               0.46
                                      Nonowner   Large and Medium MSA MO/MB                        4.18               3.42
                                                 Small MSA & Non-MSA MO/MB and All other TEAs      2.64               0.12
Domain 3 (Hispanic)                   Owner      Large and Medium MSA MO/MB                        1.46               0.04
                                                 Small MSA & Non-MSA MO/MB and All other TEAs      1.66               1.08
                                      Nonowner   Large and Medium MSA MO/MB                        3.52               4.98
                                                 Small MSA & Non-MSA MO/MB and All other TEAs      4.88              10.74

Groups defined by tenure only:

Race/Hispanic origin domain number*                                   Owner   Nonowner
Domain 5 (Native Hawaiian or Pacific Islander)                        2.71    6.58
Domain 6 (Non-Hispanic Asian)                                         0.55    1.58
Domain 1 (American Indian or Alaska Native, On Reservation)           5.04    4.10
Domain 2 (American Indian or Alaska Native, Off Reservation)          1.60    5.57

*For Census 2000, persons can self-identify with more than one race group. For post-stratification purposes, persons are included in a single Race/Hispanic Origin Domain. This classification does not change a person's actual response. Further, all official tabulations are based on actual responses to the census. A negative net undercount denotes a net overcount.

Table 7-7. Census 2000 A.C.E. 64 Post-Stratum Groups - Standard Error of the Net Undercount in Percent

Domain 7 (non-Hispanic White or Some other race), Owner, by region:

MSA/TEA                          High return rate              Low return rate
                                 NE     MW     S      W        NE     MW     S      W
Large MSA MO/MB                  0.43   0.36   0.87  -0.45     1.05   1.43   1.54   2.09
Medium MSA MO/MB                 0.85  -0.28   0.42   0.38     1.52   0.84   1.10   2.79
Small MSA & Non-MSA MO/MB        1.33   0.40   0.43   0.57     3.60   2.12   1.08   1.49
All other TEAs                   1.06   0.39   0.97   1.66     2.17   1.21   0.65   1.89

Remaining groups with a return-rate split:

Race/Hispanic origin domain number*   Tenure     MSA/TEA                                           High return rate   Low return rate
Domain 7                              Nonowner   Large MSA MO/MB                                   0.63               1.01
                                                 Medium MSA MO/MB                                  0.71               1.24
                                                 Small MSA & Non-MSA MO/MB                         0.51               1.24
                                                 All other TEAs                                    0.94               1.67
Domain 4 (Non-Hispanic Black)         Owner      Large and Medium MSA MO/MB                        0.56               1.24
                                                 Small MSA & Non-MSA MO/MB and All other TEAs      1.07               1.86
                                      Nonowner   Large and Medium MSA MO/MB                        0.66               1.05
                                                 Small MSA & Non-MSA MO/MB and All other TEAs      0.96               2.08
Domain 3 (Hispanic)                   Owner      Large and Medium MSA MO/MB                        0.52               1.26
                                                 Small MSA & Non-MSA MO/MB and All other TEAs      1.01               2.09
                                      Nonowner   Large and Medium MSA MO/MB                        0.67               1.12
                                                 Small MSA & Non-MSA MO/MB and All other TEAs      1.55               4.12

Groups defined by tenure only:

Race/Hispanic origin domain number*                                   Owner   Nonowner
Domain 5 (Native Hawaiian or Pacific Islander)                        3.83    4.07
Domain 6 (Non-Hispanic Asian)                                         0.87    0.98
Domain 1 (American Indian or Alaska Native, On Reservation)           1.45    1.42
Domain 2 (American Indian or Alaska Native, Off Reservation)          1.95    2.02

*See the note to Table 7-6.

Attachment. The Effect of Movers on Dual System Estimation

This attachment discusses the effect of movers on Dual System Estimation (DSE). Three alternative methodologies for handling movers in DSE have been considered by the U.S. Census Bureau. Historically, they are referred to as PES-A, PES-B, and PES-C. However, the current terminology is to refer to them as Procedures A, B, and C. Following are the definitions of these methodologies as described in U.S.
Bureau of the Census (1985).

Procedure A. This procedure reconstructs the households as they existed at the time of the census. A respondent is asked to identify all persons who were living or staying in the sample household on Census Day. These persons are then matched against names on the census questionnaire for the sample address (and surrounding area). From this information, estimates of the number and percent matched for nonmovers and outmovers can be made.

Procedure B. This procedure identifies all current residents living or staying in the sample household at the time of the interview. The respondent is asked to provide the address(es) where these persons were living or staying on Census Day. These persons are then matched against names on corresponding census questionnaire(s) at the nonmover's or inmover's census address. Estimates of the number and percent matched for nonmovers and inmovers can be made.

Procedure C. This procedure identifies all current residents living or staying at the sample address at the time of the interview plus all other persons who lived at the sample address on Census Day and have moved since Census Day. However, only the Census Day residents (nonmovers and outmovers) are matched with the census questionnaire(s) at the sample address. Estimates of the number of nonmovers, outmovers, inmovers, and the percent matched for nonmovers and outmovers, can then be made. Estimates of nonmovers and movers come from Procedure B and match rate estimates for the movers from Procedure A (using outmover matching). Thus, Procedure C is a combination of Procedures A and B.

In 1990, Procedure B was used. The unresolved match rate for inmovers in 1990 was high, around 13 percent. In addition, with sampling for nonresponse initially planned for Census 2000, inmover matching would have had an even higher level of difficulty. A decision was made that Procedure B would NOT be used for Census 2000. When the Supreme Court decided against sampling for apportionment (no sampling for nonresponse), it was too late to change the decision on Procedure B.

In the 1995 and 1996 Census tests, Procedure A was used. The U.S. Census Bureau reasoned that an outmover match rate would be more accurate than an inmover match rate, particularly with sampling for nonresponse. For outmovers, interviewers attempted to obtain the names, new addresses and other data that could be used for matching from the new occupants or neighbors. Then an attempt could be made to trace the people to obtain an interview with a household member. The best available data for outmovers was matched to their Census Day addresses in the same manner as for the nonmovers. Outmover tracing had problems in 1995 and was tested in 1996 and in the Census 2000 Dress Rehearsal. The outmover tracing evaluation by Raglin and Bean (1999) showed that there is little gain in an outmover tracing operation. A decision was made to use the outmover proxy interview data for outmover matching for Census
2000.

Procedure C was tested in the Census 2000 Dress Rehearsal and it was used in Census 2000 (Schindler, 1999). The advantage of Procedure C is that the estimate of the number of movers uses inmover data, which is more reliable since it is collected from the inmovers themselves. The match rate of the movers is estimated using the outmover match rate so that the difficulties of inmover matching are avoided. Outmover tracing is a problem, however, and in many cases it is necessary to use proxy data for matching. There was no outmover tracing for Census 2000. Procedure C attempts to obtain a Procedure B estimate with no inmover matching. Procedure C and Procedure B estimates are different since outmovers do not have the same match rate as inmovers. However, the disadvantage of the Procedure B inmover match rate estimate is that it may yield a high percentage of unresolved cases.

Chapter 8. Model-Based Estimation for Small Areas

INTRODUCTION

This chapter documents the Accuracy and Coverage Evaluation (A.C.E.) methodology of synthetic estimation for small areas including the estimation of sampling variances of synthetic estimates and the generalization of the variances. Synthetic estimation is the particular model used for coverage adjustment for small areas for A.C.E. First, the synthetic estimation methodology and the implied model are described. Then, the methodology for estimating sampling variances of these synthetic estimates and for generalizing these variances are discussed.

SYNTHETIC ESTIMATION METHODOLOGY FOR SMALL AREAS

Background

As discussed in Chapter 7, dual system estimates (DSE) and coverage correction factors were calculated at the post-stratum level. These are direct A.C.E. Survey estimates, based only on data from sample units in the post-stratum. However, census counts adjusted for coverage error are desirable for small geographic areas much smaller than any post-stratum such as blocks, tracts, counties and congressional districts. The adjusted counts were expected to improve data used for congressional redistricting as well as states, most metropolitan areas, and larger counties and cities and to provide consistent totals when census data are aggregated over many small areas. Many of these areas do not include any A.C.E. sample units, making a direct estimate impossible (see Chapter 3 for details of A.C.E. sampling). The geographic areas that include A.C.E. sample units only have a small number of sample units. A direct estimate would result in unacceptably large standard errors.

Synthetic estimation is discussed in Ghosh and Rao (1994), Gonzalez (1973), and Gonzalez and Waksberg (1973). Gonzalez (1973) describes synthetic estimation as follows: "An unbiased estimate is obtained from a sample survey for a large area; when this estimate is used to derive estimates for subareas under the assumption that the small areas have the same characteristics as the large area, we identify these estimates as synthetic estimates." Synthetic estimation was first used by the National Center for Health Statistics (1968) to calculate state estimates of long and short term physical disabilities from the National Health Interview Survey data (Ghosh and Rao, 1994). Synthetic estimation is a useful procedure for small area estimation, mainly due to its simplicity and potential to increase accuracy in estimation by borrowing information from similar small areas.

Synthetic estimation was used for Census 2000 to provide adjusted population estimates for small geographic areas such as blocks, tracts, counties, and congressional districts. These block-level estimates can then be aggregated to any geographic level. The synthetic estimates provide revised population counts for both all persons and persons 18 and over. Counts are also provided for Hispanic or Latino persons by race (63 categories) and Not Hispanic or Latino persons by race (63 categories) for both the total population and the population 18 years and over. For example, counts of single-race Asian persons who are Not Hispanic or Latino are given for both the total population and the population 18 years and over. Counts of single-race Asians who are Not Hispanic or Latino who are less than 18 years of age can be obtained by subtraction.

Synthetic estimates are formed by combining coverage measurement results with census counts to produce population estimates for any geographic area of interest. For example, a block-level synthetic estimate is formed by distributing a post-stratum's coverage correction factor to blocks proportional to the size of the post-stratum's population within the block. Rounded, adjusted synthetic estimates at the tabulation block level constitute the adjusted redistricting[1] data file.

The synthetic estimation model assumes that coverage correction factors are uniform within a given post-stratum, meaning that the coverage error rate for a given post-stratum is the same within all blocks. To the extent that the synthetic assumption is incorrect, the estimates of coverage for individual areas are biased and, hence, so are the population size estimates based on the coverage correction factors. Synthetic estimation bias decreases as the size of the geographic area increases.

Synthetic Estimation

This section describes the calculation of synthetic estimates. Synthetic estimation includes a controlled rounding procedure used to produce estimates that are integer-valued. The visual representation of the twelve steps in the controlled rounding process given in Haines (2001) is provided here.
[1] Since it was originally intended that the A.C.E. might be used to adjust census counts for redistricting, such data is called redistricting data, although it was not ultimately used for that purpose.

Calculation

Consider forming synthetic estimates for geographic level g for a given post-stratum. Let $C_{i,g}$ denote the census count for post-stratum i in geographic level g and define $CCF_i$ to be the coverage correction factor for post-stratum i. The general form for a synthetic estimate for post-stratum i at geographic level g is calculated as

$$\hat{N}_{i,g}^{S} = C_{i,g} \times CCF_i.$$

Aggregating synthetic estimates over all the post-strata in geographic level g yields a synthetic estimate for the total population of geographic level g. This is denoted as

$$\hat{N}_{g}^{S} = \sum_i C_{i,g} \times CCF_i.$$

One purpose of synthetic estimation and the controlled rounding procedure is to produce integer-valued adjusted synthetic estimates at the tabulation block level. Then, summing over different geographies within a larger area yields the same estimate as that for the larger geographic area. These estimates comprise the adjusted redistricting data file.

Geography

Components of synthetic estimates use two slightly different organizations of geography. Both collection and tabulation blocks are used in the synthetic estimation process. A collection block is a geographic area used during census data-collection activities. The Hundred-Percent Census Edited File (HCEF) is based on collection block geography. Tabulation blocks, on the other hand, are geographic areas used for tabulating census data. The Hundred-Percent Detail File (HDF) is based on tabulation block geography. Synthetic estimation census counts are based on tabulation block geography while the coverage correction factors associated with post-strata are based on collection block geography. This could have ramifications on variables with a geographic component, although any such effects are probably small.

For example, consider the post-stratification variable return rate. Return rate was calculated at the tract level and based on collection-tract definitions. People were assigned to post-strata based on the return rate of tracts defined using collection blocks. Now consider the case where people are assigned to post-strata based on the return rate of tracts defined using tabulation blocks. It could be the case that the change in geography causes an individual's post-stratum assignment to change. For example, suppose the return rate of a collection-tract is 80 percent and that the collection tract is split into two pieces by a tabulation-tract. A person who belonged to the collection-tract (with an 80 percent return rate) may now belong to a tabulation-tract with a different return rate. Changes in an individual's post-stratum would also cause changes in the dual system estimates, coverage correction factors, and synthetic estimates. To avoid potential inconsistencies in the assignment of people to post-strata, there was only one assignment of people to post-strata. The assignment was based on collection-block geography, which was consistent with the geography used in the A.C.E. Further, this post-stratification assignment was maintained for all estimation purposes.

Controlled Rounding

Synthetic estimates at any geographic level are not typically integer-valued. A controlled rounding program, developed by the Statistical Research Division (SRD) of the U.S. Census Bureau, was utilized that produces integer-valued estimates. The theory of controlled rounding is given in Cox and Ernst (1982). The problem is represented as a transportation theory problem to minimize an objective function that measures the change due to controlled
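The synthetic estimate defined in the Calculation subsection can be sketched in a few lines. This is only an illustration under stated assumptions: the post-stratum labels, block names, counts, and coverage correction factors below are invented, and the function names are hypothetical. It shows the unrounded estimates only; the controlled rounding step is not reproduced.

```python
def synthetic_estimates(counts, ccf):
    """Unrounded synthetic estimates: census count times the post-stratum CCF.

    counts -- {area: {post_stratum: census count}}
    ccf    -- {post_stratum: coverage correction factor}
    """
    return {
        area: {ps: c * ccf[ps] for ps, c in by_ps.items()}
        for area, by_ps in counts.items()
    }


def area_total(estimates, area):
    """Total synthetic estimate for one area, summed over post-strata."""
    return sum(estimates[area].values())


# Hypothetical inputs: two post-strata and two blocks in one tract.
ccf = {"ps1": 1.02, "ps2": 0.99}
block_counts = {
    "block_A": {"ps1": 50, "ps2": 100},
    "block_B": {"ps1": 10, "ps2": 0},
}

block_est = synthetic_estimates(block_counts, ccf)

# The same factors applied to tract-level counts give the tract estimate.
tract_counts = {"tract_T": {"ps1": 60, "ps2": 100}}
tract_est = synthetic_estimates(tract_counts, ccf)

sum_of_blocks = area_total(block_est, "block_A") + area_total(block_est, "block_B")
tract_total = area_total(tract_est, "tract_T")
```

Before rounding, the block estimates add up to the tract estimate (up to floating-point error) because the same factor is applied to every count in a post-stratum; the controlled rounding procedure is what preserves this additivity once the estimates are made integer-valued.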
rounding. In essence, the controlled rounding program takes a two-dimensional matrix of numbers and rounds each to an adjacent integer value based on an efficiency algorithm. An optimal solution that minimizes the change due to controlled rounding is guaranteed; there can, however, be more than one optimal solution. The two dimensions of the matrix are: 1) the post-strata for one level of geography; and 2) totals for a lower level of geography. The controlled rounding procedure ensures that the sum of the synthetic estimates within a geographic level is rounded up or down by an amount strictly less than one person.

The overall goal of controlled rounding was to obtain an integer number of persons for each post-stratum i within each tabulation block b, reflecting the estimates of overcount and undercount. The controlled rounding program could not be implemented in one step due to the size of the post-strata by tabulation block matrix. As a result, controlled rounding was implemented in steps such that the rounded, adjusted synthetic estimates for blocks sum to:

* the rounded, adjusted synthetic estimates for tracts,
* the rounded, adjusted synthetic estimates for counties, and
* the rounded, adjusted synthetic estimates for states.

In other words, the block, tract and county rounded, adjusted synthetic estimates would all be consistent with each other. Also, the state-level synthetic estimates are adjusted in order to guarantee that total population estimates at the state level sum to the national total population estimate.

A controlled rounding procedure for the U.S. can be implemented as follows:

1. Form the ratio of the control-rounded dual system estimate ($DSE^R$) to the unrounded DSE for post-stratum i. It is written as $DSE_i^R / DSE_i$.

2. For each post-stratum i within state s, multiply the state-level synthetic estimate by the ratio formed in step 1. The superscript AS denotes an adjusted synthetic estimate. The resulting product is the adjusted synthetic estimate for post-stratum i within state s, written as $\hat{N}_{i,s}^{AS} = \hat{N}_{i,s}^{S} \times \left[ DSE_i^R / DSE_i \right]$, where $\hat{N}_{i,s}^{S} = C_{i,s} \times CCF_i$.

3. Apply the controlled rounding procedure to the adjusted state-level synthetic estimates to produce rounded, state-level synthetic estimates, denoted $\hat{N}_{i,s}^{RS}$. The superscript RS denotes a rounded, synthetic estimate. The two dimensions of this matrix are state s by post-stratum i.

4. Calculate the ratio of the rounded state-level synthetic estimate to the state-level synthetic estimate for post-stratum i in state s.

5. For each post-stratum i within county c for state s, multiply the county-level synthetic estimate by the ratio formed in step 4. The resulting product is the adjusted county-level synthetic estimate for post-stratum i, written as $\hat{N}_{i,c}^{AS} = \hat{N}_{i,c}^{S} \times \left[ \hat{N}_{i,s}^{RS} / \hat{N}_{i,s}^{S} \right]$, where $\hat{N}_{i,c}^{S} = C_{i,c} \times CCF_i$.

6. Apply the controlled rounding procedure to the adjusted county-level synthetic estimates to produce rounded, adjusted, county-level synthetic estimates, denoted $\hat{N}_{i,c}^{RS}$. The two dimensions of this matrix are county c (in state s) by post-stratum i (in state s).

7. Form the ratio of the rounded, adjusted, county-level synthetic estimate to the county-level synthetic estimate for post-stratum i in county c in state s.

8. For each post-stratum i within tract t in county c for state s, form the product of the tract-level synthetic estimate and the ratio formed in step 7. This results in the adjusted tract-level synthetic estimate for post-stratum i, written as $\hat{N}_{i,t}^{AS} = \hat{N}_{i,t}^{S} \times \left[ \hat{N}_{i,c}^{RS} / \hat{N}_{i,c}^{S} \right]$, where $\hat{N}_{i,t}^{S} = C_{i,t} \times CCF_i$.

9. Apply the controlled rounding procedure to the adjusted tract-level synthetic estimates to produce rounded, adjusted tract-level synthetic estimates, denoted $\hat{N}_{i,t}^{RS}$. The two dimensions of this matrix are tract t (in county c in state s) by post-stratum i (in county c in state s).

10. Calculate the ratio of the rounded, adjusted tract-level synthetic estimate to the tract-level synthetic estimate for post-stratum i in tract t in county c in state s.

11. For each post-stratum i within block b in tract t in county c for state s, multiply the block-level synthetic estimate by the ratio formed in step 10. The resulting product is the adjusted block-level synthetic estimate for post-stratum i, written as $\hat{N}_{i,b}^{AS} = \hat{N}_{i,b}^{S} \times \left[ \hat{N}_{i,t}^{RS} / \hat{N}_{i,t}^{S} \right]$, where $\hat{N}_{i,b}^{S} = C_{i,b} \times CCF_i$.

12. Again, apply the controlled rounding procedure to the adjusted block-level synthetic estimates to produce rounded, adjusted block-level synthetic estimates, denoted $\hat{N}_{i,b}^{RS}$. The two dimensions of this matrix are block b in tract t in county c in state s by post-stratum i in tract t in county c in state s.

[Matrix diagrams accompanying the steps: at the state, county, and tract levels, one matrix of adjusted estimates ($\hat{N}^{AS}$) and one of rounded estimates ($\hat{N}^{RS}$), with rows indexed by the geographic unit (s, c, or t) and columns by post-stratum i.]

Record Replication for Coverage Correction

Once the rounded, adjusted block-level synthetic estimates were formed, they were compared with the census counts for post-stratum i in tabulation block b. Person records were then replicated at the post-stratum level to reflect the coverage correction for the census blocks. No attempt was made to place these persons in households.
Thus, for example, the number of persons per household does not change due to coverage correction. The number of records replicated depends on the value of the coverage correction factor that is reflected in the rounded synthetic estimate for post-stratum i and tabulation block b.

Coverage Correction Factors > 1

When the coverage correction factor for post-stratum i was greater than one, undercount person records were replicated to reflect the undercount in post-stratum i and block b as follows:

$$U_{i,b} = \hat{N}_{i,b}^{RS} - C_{i,b}$$

If $U_{i,b} = 0$, then no additional records were necessary. If $U_{i,b} > 0$, then we replicated $U_{i,b}$ undercount person records for post-stratum i in tabulation block b. Undercount person records were replicated by randomly selecting without replacement $U_{i,b}$ records from the $C_{i,b}$ available person records in post-stratum i and tabulation block b. The selected records were replicated and appended to the file of person records. The undercount person record for each of the replicated records was given an effective weight of +1 for tabulations. This resulted in an upward adjustment of people in post-stratum i in tabulation block b.

Coverage Correction Factors < 1

When the coverage correction factor for post-stratum i was less than one, overcount person records were replicated to reflect the overcount in post-stratum i and block b as follows:

$$O_{i,b} = C_{i,b} - \hat{N}_{i,b}^{RS}$$

If $O_{i,b} > 0$, then $O_{i,b}$ overcount person records were replicated for post-stratum i in tabulation block b. Overcount person records were replicated by randomly selecting without replacement $O_{i,b}$ records from the $C_{i,b}$ available person records in post-stratum i and tabulation block b. The selected records were replicated and appended to the file of person records. The overcount person record for each of the replicated records was given an effective weight of -1 for tabulations, resulting in a downward adjustment of people in post-stratum i in tabulation block b.

VARIANCE ESTIMATION FOR SMALL AREAS

Estimating the error due to sampling for any published estimate is a policy of the Census Bureau. This policy applies to synthetic estimates as well as the more traditional estimates. Due to the large number of estimates at lower levels of geography, it is not feasible to provide tables listing the standard error of each published estimate. Instead, a parameter, the generalized coefficient of variation (GCV), is provided, that allows users to approximate the standard error for any desired estimate. The coefficient of variation of an estimate is simply the ratio of the estimate's standard error to the estimate itself.

Small area variance estimation is a two-step process. The first step consists of producing direct variance estimates for the synthetic count estimates for small areas such as census tracts. This process is explained under the heading Direct Variance Estimates. The second step is to model the direct variance estimates using the generalized coefficient of variation, or GCV. This method is explained under the heading Generalized Variance Estimates, along with an
example.

Variances calculated for small areas do not account for all sources of synthetic error; they only reflect variations due to sampling. Synthetic population bias can exist since the same coverage correction factors are applied to areas with different net census coverage. See Griffin and Malec (2001) for details on estimating synthetic bias. In most very small geographic areas such as blocks and tracts, the biases are likely to be the principal source of errors. Sampling errors dominate the total error for larger areas such as states, metro areas, etc. Bias in the post-stratum-level dual system estimates can stem from matching bias, data collection errors, and correlation bias, among other sources. Bell (2001) investigates and estimates correlation bias in the A.C.E. dual system estimates by comparing them to results from Demographic Analysis.

Direct Variance Estimates

During the post-stratum-level A.C.E. variance estimation operation, a variance-covariance matrix of the A.C.E. coverage correction factors (CCFs) was produced. The estimated variance of any synthetic population estimate can be computed using this matrix and the unadjusted census counts, broken down by post-stratum and excluding out-of-scope persons in the A.C.E. See Starsinic (2001) for details. A synthetic household population estimate (Group Quarters persons are not included) for tract t is written as

$$\hat{X}_t = \sum_{\text{post-strata } h} \hat{X}_{th} = \sum_{h=1}^{416} C_{th} \times CCF_h,$$

where $C_{th}$ is the final, unadjusted census count for post-stratum h in tract t. There were 416 post-strata used to estimate coverage.

The variance for the synthetic household population estimate $\hat{X}_t$ is

$$Var(\hat{X}_t) = Var\left( \sum_{h=1}^{416} \hat{X}_{th} \right) = \sum_{h=1}^{416} \sum_{h'=1}^{416} Cov(\hat{X}_{th}, \hat{X}_{th'}) = \sum_{h=1}^{416} \sum_{h'=1}^{416} Cov(C_{th}\, CCF_h,\; C_{th'}\, CCF_{h'}) = \sum_{h=1}^{416} \sum_{h'=1}^{416} C_{th}\, C_{th'}\, Cov(CCF_h, CCF_{h'}).$$

For a given data item j in tract t, the synthetic variance for the synthetic household population estimate $\hat{X}_{jt}$ is expressed as

$$Var(\hat{X}_{jt}) = \sum_{h=1}^{416} \sum_{h'=1}^{416} C_{jth}\, C_{jth'}\, Cov(CCF_h, CCF_{h'}), \qquad (1)$$

where $C_{jth}$ is the final, unadjusted census count for data item j in post-stratum h in tract t. Here h and h' refer to particular post-strata and j refers to a data item.

Generalized Variance Estimates

The generalized coefficient of variation (GCV) is the variance estimation methodology used for estimating variances of adjusted redistricting data and for estimates of adjusted population counts for the thousands of geographic areas that can be tabulated using synthetic estimation. For a given count in a particular state, the coefficient of variation (CV) was calculated for all tracts in that state that had population in the particular demographic category. The CV of an estimate is estimated as the ratio of the standard error of the estimate to the estimate itself, i.e.

$$CV(\hat{X}) = \frac{SE(\hat{X})}{\hat{X}}.$$

The standard error in the numerator is the square root of the variance estimate from (1). Tracts composed entirely of persons out-of-scope for the A.C.E. sample had no sampling variance (and therefore a CV of 0) and were removed from the processing. Also removed were tracts with a very small population in the demographic category, as these were shown in the Census 2000 Dress Rehearsal analysis to have a disproportionate downward effect on the parameters. The process of removing tracts was controlled to prevent removing an overly large fraction of small tracts for any adjusted demographic data item. In addition, outliers were identified using the relative absolute deviation (RAD) statistic for each data item j. Tracts with a RAD value above the cutoff value were removed and a new GCV was computed using CVs of remaining tracts. There were four iterations of identifying and removing outliers.

Of the 286 unique demographic categories, GCVs were calculated for the 50 states and the District of Columbia for each of the 56 largest categories and 4 additional catch-all groups.

The average of the direct CVs for data items in a state is a GCV parameter. The state-level GCV parameters can then be used to estimate the standard error of a data item for all geographic areas within that state. Consider the following table of GCV parameters for a given state.
State Parameters for Calculating the Standard Error of A.C.E.-Adjusted Data

                                                                All persons            Not Hispanic or Latino
Demographic category                                            All ages  18 and over  All ages  18 and over
                                                                GCV       GCV          GCV       GCV
All persons                                                     0.0063    0.0067       0.0066    0.0069
Hispanic or Latino                                              0.0106    0.0115       X         X
Population of one race                                          0.0064    0.0067       0.0066    0.0069
White alone                                                     0.0073    0.0077       0.0081    0.0083
Black or African American alone                                 0.0073    0.0083       0.0073    0.0083
American Indian and Alaska Native alone                         0.0143    0.0147       0.0188    0.0190
Asian alone                                                     0.0080    0.0085       0.0081    0.0086
Native Hawaiian and Other Pacific Islander alone                0.0391    0.0495       0.0507    0.0545
Some Other Race alone                                           0.0109    0.0119       0.0126    0.0139
Population of two or more races                                 0.0070    0.0077       0.0071    0.0082
Population of two races                                         0.0071    0.0078       0.0071    0.0082
White; Black or African American                                0.0103    0.0156       0.0103    0.0157
White; American Indian and Alaska Native                        0.0088    0.0092       0.0096    0.0100
White; Asian                                                    0.0116    0.0131       0.0120    0.0133
Black or African American; American Indian and Alaska Native    0.0129    0.0140       0.0128    0.0140
Asian; Native Hawaiian and Other Pacific Islander               0.0524    0.0560       0.0530    0.0566
All other combinations of two or more races                     0.0088    0.0095       0.0088    0.0099

Suppose a data user is interested in calculating the standard error of the population estimate of all Asians in a given county. The data user would locate the GCV parameter that corresponds to the "Asian alone" demographic category and the "All persons, All ages" classification in the appropriate state table. For the table above, the GCV parameter is 0.0080. Now assume that the population estimate for all Asians in this county is 370 people. Users are instructed to use the formula $SE(X) = GCV \times X$ to calculate the estimated synthetic standard error, yielding 0.0080 x 370 = 2.96, or about 3 people in this example. Similar calculations can be done for any geographic level and demographic category.

Appendix A. Census 2000 Missing Data

INTRODUCTION

The Census Bureau used imputation in the 2000 Decennial Census, as it has in prior censuses, to address the problem of missing, incomplete, or contradictory data, an inevitable aspect of censuses and surveys. It is impossible not to have missing data in an endeavor as massive and complex as a decennial census. In Census 2000, the Census Bureau processed data for over 120 million households, including over 147 million paper questionnaires and 1.5 billion pages of printed material. In the 2000 Census, the various situations that resulted in missing data included incomplete or unavailable responses from housing units with previously confirmed addresses, conflicting data about the same housing unit, and failures in the data-capture process. The various types of missing data included characteristic data (information about an enumerated person, such as sex, race, age), population count data (information about the number of occupants in an identified housing unit), and housing unit status data (whether the unit is vacant, occupied, or nonexistent). The 2000 Census used two primary types of imputation.

1. The first type, called count imputation, is imputation of the number of occupants of a housing unit. Count imputation applies when the Census Bureau is unable
to secure any information regarding a given address, or when the Census Bureau has limited information about the address and does not have definitive information on the number of occupants.

2. The second type, called characteristic imputation, is imputation that supplies missing characteristic data for a housing unit's response, but does not involve the number of occupants for a housing unit. For example, if a given housing unit did not provide ages for the individuals living in the housing unit, but supplied all other information, age would be imputed for the individuals in that housing unit. Sometimes the household size is known for the housing unit; however, none of the characteristics about the people are known. In this case all of the persons' characteristics are imputed.¹

This appendix summarizes the methods used to impute these types of missing data in the census. Summary statistics showing the degree of imputation for these categories are given in the last section of this appendix.
BACKGROUND

The census data collection activities started around mid-March of 2000, through the mail or directly using census enumerators. From June to September, census staff conducted nonresponse follow-up (NRFU) and coverage improvement follow-up (CIFU) operations to revisit addresses for which census reports were not completed, i.e., addresses that did not respond to mailout/mailback or early enumeration operations. Based on the results of these operations, the Census Bureau was able to designate more than 99.5 percent of housing unit records as occupied, vacant, or nonexistent housing unit addresses. Designating an address as vacant or nonexistent required at least two independent census operations. This was to ensure complete census coverage. The nonexistent housing units were addresses of places used only for nonresidential purposes, or places that were uninhabitable and were not included in the census counts.

To permit the production of census population counts, it was necessary for each census address to have a status of occupied/vacant/nonexistent and a household size if occupied. To permit the production of the redistricting file and other more detailed census products, it was also necessary to have information about each person such as age, race, and sex. The count imputation covered status for housing units with undetermined status and household size for occupied units with an unknown number of occupants. The characteristic imputation was used to fill in the missing person data.

Census housing units identified in the Accuracy and Coverage Evaluation (A.C.E.) block clusters were defined as the E-sample housing units. Persons residing in these housing units were E-sample persons. It took several different census operations to establish a list of census housing unit records and a list of census person records. One of these operations was the creation of a Hundred Percent Census Unedited File (HCUF). At the housing unit level, all housing units designated as occupied or vacant through data collection or through imputation were included in the HCUF. The file was used as a source file to identify the E-sample housing units for the A.C.E. operations. At the person level, the HCUF was used as a source file for person matching between the census and the A.C.E. (However, this does not include imputed persons, since they were not sent to A.C.E. matching.) Chapter 3 provides detailed information on E-sample identification, while Chapter 4 provides information on person matching.
¹ This does not include geographic characteristics such as location, urban or rural residency, etc., which are generally known for all households.

Persons imputed to an occupied unit with an unknown number of occupants or persons with all their characteristics imputed were considered as non-data-defined persons in the person Dual System Estimation (DSE). For data-defined persons, characteristic imputation filled in census missing data, such as sex, age, ethnicity, and owner/renter status for person DSE poststratification purposes.

COUNT IMPUTATION

The Census Bureau used count imputation for three categories of cases in Census 2000.
1. Household size imputation. The Census Bureau imputed the number of occupants for a housing unit when Census Bureau records indicated that the housing unit was occupied, but did not show the number of individuals residing in the unit.

2. Occupancy imputation. When Census Bureau records indicated that a housing unit existed, but not whether it was occupied or vacant, the Census Bureau imputed occupancy status (occupied or vacant). If the unit was imputed to be occupied, the household size was also imputed.

3. Status imputation. When the Census Bureau's records had conflicting or insufficient information about whether an address represented a valid, nonduplicated housing unit, the Census Bureau first imputed the status of the unit (occupied, vacant, nonexistent); then, if occupied, the household size was imputed.

Methodology

The Census Bureau used the nearest-neighbor hot deck imputation methodology to perform the count imputation. "Nearest" was defined by the geographical closeness of housing units. Group quarters addresses were included in the measure of distance, although not otherwise involved in count imputation. Census geographical identifiers, such as tract number, block number, or map spot number, along with street name, house number or apartment number, were used to describe geographical proximity of housing records. To properly assign status and number of occupants to the housing units requiring imputation, limited donor pools and expanded donor pools were developed for each imputation category, which were further subdivided by type of structure.

All cases with missing status, occupancy or household size went through intensive follow-up operations to reduce the amount of imputation as much as possible.
This was the main purpose of the NRFU and CIFU for mailout/mailback areas and enumerator visit in list/enumerate or update/enumerate areas. To properly represent these cases (donees), the primary donor pools were also housing units from NRFU, CIFU or from other enumerator visited cases. In the design phase, the Census Bureau did develop a standby procedure to include all enumerations in an expanded donor pool. With 99.5 percent of housing units having status and household size information available from data collection activities, the expanded donor pools were never used. The chart below characterizes the relationship between donees and the primary donor pool by imputation category.

Donors and Donees by Imputation Category

Imputation category: Household size imputation (a. Single units, b. Multiunits)
  Donees: Occupied with unknown household population
  Donor pool: Occupied units with known population (in NRFU, or CIFU, or from list/enumerate or update/enumerate areas)

Imputation category: Occupancy imputation (a. Single units, b. Multiunits)
  Donees: Units known to be either occupied or vacant
  Donor pool: Occupied units with known population or vacant units (in NRFU, or CIFU, or from list/enumerate or update/enumerate areas)

Imputation category: Status imputation (a. Single units, b. Multiunits)
  Donees: Units with no status information
  Donor pool: Occupied units with known population, vacant units, or nonexistent units (in NRFU, or CIFU, or from list/enumerate or update/enumerate areas)

In general, type of structure (multi or single), type of enumeration (mail or list/enumerate), and final stage of the data collection for a housing unit (initial collection, NRFU, or CIFU) determined whether a housing record could be used as a primary donor. Each available donor could only be used once. Most of the time, the nearest potential donor was selected as the donor. Occasionally, a second nearest neighbor was designated as the donor, because the nearest donor had been taken by some previously processed donee. Whenever possible, the donor and donee were to be in the same tract, or in the same multiunit if the donee was located in a multiunit building.

To identify the nearest donor, a search was conducted in both directions: forward and backward. Using the donee as a reference point, potential donors surrounding the donee record were searched, and the donor housing unit geographically closest to the donee housing unit was determined. The search was done separately for single units and multiunits.

CHARACTERISTIC IMPUTATION

Characteristic imputation was the process of filling in missing person characteristics, which include sex, age/date of birth, relationship, Hispanic origin and race.
The Census Bureau used characteristic imputation for three categories of cases in Census 2000.

1. Whole household imputation. The Census Bureau imputed all of the characteristics for all of the persons in the household when the household record did not contain any data-defined persons. To be data-defined, a person record must contain two or more of the 100-percent population data items, or a name.

2. Within household imputation. The Census Bureau imputed all the 100-percent characteristics for any non-data-defined persons in the household when the household record contained at least one data-defined person.

3. Within person imputation. Sometimes some of the 100-percent characteristic data for data-defined persons were missing and were imputed.
Methodology

The categories of characteristic imputation employ different methodologies. For whole household imputations, the process replicates all of the 100-percent person data items by substituting data from a hot deck nearest neighbor donor pool record of the same household size. This process is sometimes referred to as substitution, since it assigns all the characteristics for all of the persons in the selected donor household to the household requiring imputation. This substitution process is also used to obtain the person characteristics for those housing units that were imputed as occupied or had their household size imputed during the count imputation process. By definition these households do not contain any data-defined persons. However, the majority of whole household imputations occur for cases where a census response on household size was obtained.

For within household imputations as well as within person imputations, the process allocates missing values for individual person characteristic data items on the basis of other reported information for other persons in the household, or from other persons in households with similar characteristics.
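The substitution step for whole household imputation can be illustrated with a minimal Python sketch. It assumes each household record carries a position in a geographically sorted file (a hypothetical `order` field standing in for tract, block, and map spot sort keys), a household size, and a list of person characteristics that is empty when no person is data-defined. The rule that a donor serves only one donee is taken from the count imputation description; applying it to substitution is an assumption, as are all class, field, and function names.

```python
from dataclasses import dataclass, field


@dataclass
class Household:
    order: int   # hypothetical position in a geographically sorted file
    size: int    # household size (reported or count-imputed)
    persons: list = field(default_factory=list)  # empty if no data-defined persons


def substitute_whole_households(households):
    """Copy person data from the nearest unused donor of the same household size."""
    donors = [h for h in households if h.persons]
    used = set()      # donors already assigned (assumed: each donor used once)
    imputed = {}
    for donee in (h for h in households if not h.persons):
        candidates = [d for d in donors
                      if d.size == donee.size and d.order not in used]
        if not candidates:
            continue  # no eligible donor in this toy pool
        # Searching forward and backward amounts to taking the smallest gap.
        donor = min(candidates, key=lambda d: abs(d.order - donee.order))
        used.add(donor.order)
        imputed[donee.order] = [dict(p) for p in donor.persons]
    return imputed


# Made-up records for illustration only.
records = [
    Household(1, 2, [{"age": 34, "sex": "F"}, {"age": 36, "sex": "M"}]),
    Household(2, 2),
    Household(3, 1, [{"age": 70, "sex": "F"}]),
    Household(4, 2, [{"age": 25, "sex": "M"}, {"age": 27, "sex": "F"}]),
    Household(5, 2),
]
result = substitute_whole_households(records)
```

In this toy file, household 2 takes its persons from household 1 and household 5 from household 4; household 3 is never a candidate because its size differs.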
RESULTS

This section briefly summarizes the overall level of imputation for people whose 100-percent characteristics were totally imputed in Census 2000 (within person imputations are excluded) for the U.S. population residing in housing units.

Census 2000 Housing Unit Persons by Imputation Category
(Excludes within person imputations)

Category                                              Number of persons   Percent of total persons
Total housing unit population                         273,643,272         100.00
100-percent characteristic imputation not required    267,869,007          97.89
100-percent characteristic imputation required          5,774,266           2.11
  Count imputations:                                    1,172,144           0.43
    Household size                                        495,600           0.18
    Occupancy                                             260,652           0.10
    Status                                                415,892           0.15
  Characteristic imputations                            4,602,122           1.68
    Whole household¹                                    2,269,010           0.83
    Within household                                    2,333,112           0.85

¹ The count imputation cases (also requiring characteristic imputation) are not included in this figure to avoid duplication.

About 2 percent of persons residing in housing units required imputation of all 100-percent characteristics. The majority of these cases, about 1.7 percent, occurred in situations where a census response on household size was obtained. Less than half of a percent were situations where household size or the status of the housing unit was unknown.

Appendix B. Demographic Analysis

INTRODUCTION

The Census Bureau has used Demographic Analysis (DA) to measure population coverage, trends between censuses, and differences in coverage by age, sex, and race (Black, non-Black) at the national level in every census since 1960 (Siegel and Zelnik (1966), Siegel (1974), Fay et al. (1988), and Robinson et al. (1993)). DA produces estimates of the U.S. population through the use of data from administrative records and other noncensus sources. It has documented both the long-term reduction in the census net undercount rate and the persistent and disproportionate undercount of certain demographic groups, such as Black men. One goal of Census 2000 was to reduce these differential undercounts, which has been a continuing effort for the last several censuses.

The independence from the census and internal consistency of the DA estimation process allow us to compare the results with the survey-based Accuracy and Coverage Evaluation (A.C.E.) coverage estimates; in particular, the consistency of the age-sex results can be assessed. DA and A.C.E. use entirely different methodologies. Because the sources and patterns of errors in the two estimates are sufficiently different, any disagreement in the results can shed light on both the quality of the census and potential problems in methodology in the A.C.E. or the DA. Because of data limitations, DA estimates and comparisons are only possible at the national level and for certain large
demographic groups. A further discussion of DA limitations is found in the section "Limitations of DA Estimates" of this appendix.

The U.S. Census Bureau released two sets of DA results as part of its evaluation of Census 2000 and the A.C.E. All DA results in this section are from the revised values released in October 2001. See Robinson (2001) for details.

DESCRIPTION OF THE DEMOGRAPHIC ANALYSIS METHOD

Demographic Analysis represents a macro-level approach for measuring coverage. Estimates of net undercount are obtained by comparing census counts to independent estimates of the population derived from other measures (mostly administrative data). In general, DA population estimates are developed for the census date by combining various types of demographic data that are independent of the census and are highly reliable, such as administrative statistics on births, deaths, and Medicare data and estimates of immigration and emigration. The difference between the DA estimated population (P) and the census count (C) provides an estimate of the net census undercount (u). Dividing the net undercount by the DA benchmark provides an estimate of the net undercount rate (r):

$u = P - C$

$r = (u / P) \times 100$

The particular analytic procedure used to estimate coverage nationally for the various demographic subgroups depends primarily on the nature and availability of the required demographic data. Two different demographic techniques were used to produce the demographic analysis estimates for 2000, one for the population under age 65 and another for the population 65 and over.

Ages under 65. The Demographic Analysis estimates for the population below age 65 are based on the compilation of historical estimates of the components of population change: births since 1935 (B), deaths to persons born since 1935 (D), immigrants born since 1935 (I), and emigrants born since 1935 (E). Presuming that the components are accurately measured, the population estimates ($P_{0-64}$) are derived by the basic demographic accounting equation applied to each birth cohort:
$P_{0-64} = B - D + I - E$

The size of the component estimates used to develop the DA population under age 65 for 2000 is shown in Table B-1:

Table B-1. DA Estimates of the Components of Change for the U.S. Resident Population: April 1, 2000

Component                                          Estimate
Total population                                   281,759,858
Under age 65 in 2000
  + Births since 1935 (B)                          234,860,298
  - Deaths to persons born since 1935 (D)           14,766,736
  + Immigration of persons born since 1935 (I)      32,563,971
  - Emigration of persons born since 1935 (E)        5,485,117
Ages 65 and over in 2000
  Medicare-based population                         34,587,440

Clearly, births (234.9 million) represent by far the largest component. The immigration component (32.6 million) is second largest, followed by deaths (14.8 million) and emigrants (5.5 million).

The actual calculations are carried out for single-year birth cohorts. For example, the estimate of the population age 40 on April 1, 2000 is based on births from April 1959 to March 1960 (adjusted for under-registration), reduced by deaths to the cohort in each year between 1959 and 2000, and incremented by estimated immigration and emigration of the cohort over the 40-year period.

The components for births and deaths are compiled principally from vital statistics records augmented by correction factors. The immigration component is estimated from its
subcomponents:

Table B-2. DA Estimates of the Components of Immigration for the U.S. Resident Population Under 65 Years of Age: April 1, 2000

Component                                                           Estimate
Legally admitted permanent residents                                20,332,038
Other measured migration                                             2,249,001
  Migrants from Puerto Rico                                            905,698
  Temporary migrants                                                   776,002
  Civilian citizen migration                                           891,940
  Armed Forces overseas                                               -324,639
Residual foreign-born migration (includes unauthorized migrants)     9,982,932

Age 65 and over. Administrative data on aggregate Medicare enrollments are used to estimate the population age 65 and over ($P_{65+}$):

$P_{65+} = M + m$

where M is the aggregate Medicare enrollment and m is the estimate of underenrollment in Medicare. The DA population 65 and over is based on 2000 Medicare enrollments. Medicare is an administrative data set from the Health Care Financing Administration. Although Medicare enrollment is generally presumed to be quite complete, adjustments are made to the basic data to account for individuals who are omitted. An allowance is made for the estimated 1.3 million not enrolled (3.9 percent). Underenrollment factors are based on survey estimates of Medicare coverage and data on age at enrollment in the Medicare file. The DA population aged 65 and over (34.6 million) represents 12.3 percent of the total population in
2000.

The demographic component estimates for the population under 65 are combined with the Medicare-based estimate for the population 65 and over to produce the total DA population estimate of 281.8 million as of April 1, 2000.

LIMITATIONS OF DA ESTIMATES

DA estimates for the total population are available only at the national level and only for the broad categories Black and non-Black. DA cannot provide estimates for sub-national geographic areas like states or metropolitan areas, or for other demographic groups, such as Hispanics. DA also cannot provide separate estimates for census overcoverage and undercoverage, but is limited to estimating net undercount.

There are also certain inherent limitations on DA estimates because of data quality. The race categories reflect the race as assigned at the time of the event (e.g., birth or Medicare enrollment), which for some persons will differ from the race reported in the census. There is also considerable uncertainty in the quality of the data for some of the components related to immigration, most importantly the components which capture those who entered illegally or temporarily, or whose legal status had not yet been determined.

DA ESTIMATES

Compared to the Census 2000 count of 281.4 million, the DA estimate of 281.8 million implies a net census undercount of 0.12 percent (see Table B-3). The net census undercount in 2000 was dramatically different from that in 1990, which was 4.2 million, or 1.65 percent. However, the fact that DA provides only a net undercount estimate, not separate measures of gross undercount and overcount, is a limitation on its ability to shed light on specific undercoverage or overcoverage problems in the census.

Table B-3. Demographic Analysis Estimate and Net Census Undercount for the Total Population: 1990 and 2000

Category                   1990 Census   2000 Census
DA (millions)              252.9         281.8
Difference from Census       4.2           0.3
Percent difference           1.65          0.12

The DA estimates indicate that the substantial reduction in net census undercount from 1990 to 2000 was shared by almost all demographic groups. The net census undercount of males and females each fell by about 1.5 percentage points (to an estimated net census undercount of 0.86 percent for males and estimated net census overcount of 0.60 percent for females in 2000). The estimated net undercount rate dropped more for Blacks (estimated net census undercount of 2.78 percent in 2000) than non-Blacks (estimated net census overcount of 0.29 percent in 2000), reducing the differential undercount of Blacks relative to non-Blacks from 4.4 percentage points in 1990 to 3.1 points in 2000.

Table B-4. Demographic Analysis Estimates of Percent Net Census Undercount for the Total Population and Selected Demographic Groups: 1990 and 2000

Category                   1990 DA   2000 DA
Total                       1.65      0.12
Male                        2.39      0.86
Female                      0.93     -0.60
Black                       5.52      2.78
Non-Black                   1.08     -0.29
Black male, ages 20-64     11.31      8.44
Children, ages 0-4          3.72      3.84
(a minus sign denotes a net overcount)

Appendix C. Weight Trimming

INTRODUCTION

This appendix contains a general overview of the Accuracy and Coverage Evaluation (A.C.E.) weight trimming plan. The procedure was designed to protect against undue influence from a small fraction of the sample. The weight trimming criteria were established prior to the completion of data processing operations to ensure that there was no manipulation of the dual system estimates. This procedure was implemented according to the pre-specified criteria. Since only one cluster was trimmed, the impact on the dual system estimates was minimal.

The A.C.E. weight trimming procedure was designed to reduce the sampling weights for clusters that potentially could have had an extreme influence on the dual system estimates and variances. The measure of cluster influence was the net cluster error, the absolute difference between the weighted estimate of nonmatches and the weighted estimate of erroneous enumerations. When the net error exceeded a pre-set maximum value, the sampling weights were reduced. This approach reduced variance and may have introduced some bias, but it is highly likely to have reduced the mean square error for most items. If the net error of the cluster did not exceed the pre-set maximum value, the sampling weights were unchanged.

The net error criterion was examined after the A.C.E. person matching operation was completed. If the criterion for weight trimming was met, trimming was done for all sample cases in a cluster even though a cluster contributed sample to multiple post-strata. This was done prior to the missing data process.
BACKGROUND

Weight trimming guards against the possibility of a certain small number of clusters exerting an undue influence on post-stratum estimates and variances. In the A.C.E., these are expected to be due to a disproportionate number of census nonmatches or census erroneous enumerations within a few block clusters. Although extreme sampling weights can also be a source of influence in surveys, the A.C.E. sampling weights, the inverse of the probability of selection, were reasonably controlled in the sample design. These were not expected to be an important source of variance in the A.C.E.

While the A.C.E. sample design helped minimize the occurrence of highly influential clusters, a weight trimming plan was developed to reduce the effect of these potential extreme clusters. The A.C.E. weight trimming plan was a modification of the method used for the 1990 Post-Enumeration Survey (PES). As in 1990, the weights for extremely influential clusters were trimmed to yield a pre-specified net error. The intention of the plan was to lessen the impact of such clusters on the dual system estimates and variances.

The plan did not redistribute the weights across the remaining clusters to preserve totals. This would imply treating the E and P samples differently to preserve these separate totals, and contradicts the preference for consistent treatments of both samples. Since the primary interest was in the dual system estimation ratios, and not E- and P-sample totals, the weights were not redistributed.

A.C.E. WEIGHT TRIMMING METHODOLOGY

Each cluster was evaluated to determine if it contributed disproportionately to the dual system estimates and variances. If the cluster was an outlier, the cluster sampling weight was multiplied by a factor to decrease the influence of the cluster on the dual system estimates and variances.

Identify Outlier Clusters

A measure of the cluster influence was calculated for each cluster. Then, based on pre-set criteria, a decision was made whether the cluster should be identified as an outlier.

Cluster Influence. The measure of cluster influence was the net error. For purposes of weight trimming, the net error was the absolute difference between the weighted number of nonmatches and the weighted number of erroneous enumerations. The form of the weighted net error was

$Z_i = |(P_i - M_i) - (E_i - CE_i)|$   (1)

where
$Z_i$ = the net error estimate for cluster i,
$P_i$ = the weighted P-sample population estimate for cluster i,
$M_i$ = the weighted P-sample match estimate for cluster i,
$E_i$ = the weighted E-sample population estimate for cluster i, and
$CE_i$ = the weighted E-sample correct enumeration estimate for cluster i.

The first term of equation (1) was the weighted number of nonmatches in the ith cluster, while the second term was the weighted number of erroneous enumerations in the ith cluster.

Outlier Criteria. The outlier criterion was the maximum allowable net error for a single cluster. There were two different criteria based on the cluster geography. The nation was classified into two levels of geography: American Indian Reservations and the balance of the nation. The American Indian Reservation clusters were sampled at disproportionately higher rates relative to the balance of the country. In addition, separate American Indian on American Indian Reservation post-stratum estimates were planned. If the American Indian Reservation clusters were included with the rest of the nation, it is unlikely that an influential cluster would be detected. The two outlier criteria are defined in Table C-1.

Table C-1. Outlier Cluster Criteria

Cluster geography               Maximum net error
American Indian Reservations     6,250
Balance of the United States    75,000

All clusters with net error greater than the maximum allowable net error were considered influential clusters.
They were expected to disproportionately influence the dual system estimates and variances. The sampling weights of these clusters were decreased.

The maximum net error for the balance of the country was based on experience in the 1990 PES. Since the A.C.E. was roughly double the PES sample size, the maximum allowable net error was set to be half the 1990 value. For the American Indian Reservation clusters, the maximum allowable value was a function of the average sampling rates. The American Indian Reservation average P-sample cluster sampling weight was approximately one-twelfth the balance of the U.S. average P-sample cluster sampling weight. Because of this, the American Indian Reservation maximum allowable net error was one-twelfth the balance of the U.S. criterion.

Implementation Strategy. The outlier clusters were identified after the person matching operation (Chapter 4) was completed, but before the missing data process (Chapter 6). The person matching results were the major input into this process. The weight trimming estimate used the best estimate of cluster net error at that time that was operationally feasible. This timing had several implications:

* Only nonmovers and outmovers were used for deriving the estimate of nonmatches above. For dual system estimation, if the number of outmovers in a post-stratum was less than 10 then only the nonmovers and outmovers were used. Because of the small number of movers expected in most clusters, this process only used nonmovers and outmovers.

* Some nonmovers and outmovers had unresolved match status and residence status. Some E-sample cases had unresolved enumeration status. This meant the status of unresolved cases had to be estimated to identify outlier clusters. Information available at the time of the weight trimming process was used to approximately estimate the unresolved status cases. Since the weight trimming process was done before the missing data process, there was some information that the missing data process used to estimate unresolved status that was not yet available.

* A P-sample noninterview adjustment was approximated in the estimate of nonmatches. Information available during the weight trimming process was used to approximately estimate the noninterview adjustment for each cluster. As with the unresolved cases, since the weight trimming process was done before the missing data process, there was some information that the missing data process used to do the noninterview adjustment that was not yet available.

* The targeted extended search results and sampling rates were reflected in the estimate of nonmatches and erroneous enumerations (Chapter 5).

Down-Weighting Outlier Cluster

All outlier clusters were down-weighted, so that no cluster contributed more than the maximum allowable number of net errors for the appropriate geography. A separate down-weighting factor was computed for each outlier cluster. The down-weighting factor was the ratio of the outlier cluster criterion to the cluster net error computed above.

$D_i = C / Z_i$   (2)

where
$D_i$ = the down-weighting factor for cluster i,
$C$ = the maximum net error from Table C-1 for the appropriate level of geography, and
$Z_i$ = the net error estimate for cluster i from (1).

The cluster down-weighting factor was applied to the P-sample and the E-sample weights of the outlier cluster.
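Equations (1) and (2) can be combined into a short Python sketch. The two thresholds are the values from Table C-1; the function names, dictionary keys, and example cluster totals are illustrative assumptions, not A.C.E. production code or data.

```python
# Maximum allowable net error by cluster geography (Table C-1).
MAX_NET_ERROR = {
    "american_indian_reservation": 6_250,
    "balance_of_us": 75_000,
}


def net_error(p, m, e, ce):
    """Equation (1): |weighted nonmatches - weighted erroneous enumerations|."""
    return abs((p - m) - (e - ce))


def down_weighting_factor(z, geography):
    """Equation (2): C / Z_i for an outlier cluster; 1.0 leaves weights unchanged."""
    c = MAX_NET_ERROR[geography]
    return c / z if z > c else 1.0


# Hypothetical weighted cluster totals, chosen only for illustration.
z = net_error(p=120_000, m=30_000, e=15_000, ce=5_000)  # |90,000 - 10,000|
d = down_weighting_factor(z, "balance_of_us")           # 75,000 / 80,000
trimmed_net_error = d * z                                # back at the maximum
```

Because the same factor multiplies both the P-sample and E-sample weights, the trimmed cluster's weighted net error lands on the maximum C, while its ratio of nonmatches to erroneous enumerations is untouched.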
The P-sample and E-sample weights for the remaining clusters were unchanged.

A.C.E. WEIGHT TRIMMING RESULTS

Table C-2 shows the one cluster down-weighted by the weight trimming process in the balance of the United States. No clusters were trimmed on American Indian Reservations.

Figures C-1 and C-2 show the distributions of net error before weight trimming for the balance of the United States and the American Indian Reservation areas.

Table C-2. A.C.E. Weighted Net Errors for Down-Weighted Cluster

Geographic area: Balance of the United States
  Before trimming
    Estimated erroneous enumerations     79,371
    Estimated weighted nonmatches         1,396
    Estimated weighted net error         77,975
  After trimming
    Estimated weighted net error         75,000

Figure C-1. Distribution of Net Error for the Balance of the United States (histogram; vertical axis: number of clusters; horizontal axis: net error before trimming, 0 to 80,000)

Figure C-2. Distribution of Net Error for American Indian Reservations (histogram; vertical axis: number of clusters; horizontal axis: net error before trimming, 0 to 7,000)

Appendix D. Error Profile for A.C.E. Estimates

INTRODUCTION

The Accuracy and Coverage Evaluation (A.C.E.) survey provided estimates of census coverage error that have been considered for adjusting Census 2000. The estimation used the PES-C version of dual system estimation with the data collected by the A.C.E. The adjusted estimates are subject to nonsampling error, as well as sampling error. This appendix discusses the types of errors found in the use of PES-C and the measurement of these errors.

OVERVIEW OF ADJUSTED ESTIMATES

Define the following notation for each post-stratum h.
$C_h$ = census count for post-stratum h
$II_h$ = number of persons imputed into the original enumeration for post-stratum h
$I_{E,h}$ = estimated number of enumerations in post-stratum h with insufficient information for matching¹
$E_{E,h}$ = estimated number of erroneous enumerations in post-stratum h
$N_{E,h}$ = estimated population size for post-stratum h from the E sample
$CE_h$ = estimated population size for post-stratum h who could possibly be matched, $CE_h = N_{E,h} - I_{E,h} - E_{E,h}$
$N_{P,h}$ = estimated size of the P-sample population
$M_h$ = estimated number of the P-sample population enumerated in the census

The dual system estimator for the population size $N_h$ in post-stratum h is defined by

$\hat{N}_h = (C_h - II_h) (CE_h / N_{E,h}) (N_{P,h} / M_h).$

The 2000 A.C.E. used the PES-C formulation of the dual system estimator, which uses the number of inmovers to estimate the number of outmovers, but uses the match rate for the outmovers to obtain the estimate of the number of outmovers that match the census. The post-stratum index h is suppressed in the following formula.

$N_P / M = (N_n + N_i) / (M_n + (M_o / N_o) N_i)$

where
$N_n$ = estimated number of nonmovers
$N_o$ = estimated number of outmovers
$N_i$ = estimated number of inmovers
$M_n$ = estimated number of nonmovers enumerated in the census
$M_o$ = estimated number of outmovers enumerated in the census

When a post-stratum had fewer than 10 outmovers, the PES-A version of the dual system estimator that does not use inmovers was employed as follows:

$N_P / M = (N_n + N_o) / (M_n + M_o)$

The adjustment factor for post-stratum h is defined as $A_h = \hat{N}_h / C_h$. The unadjusted estimate for area j is $N_{unadj,j} = \sum_h C_{h,j}$ and the adjusted estimate is $N_{adj,j} = \sum_h A_h C_{h,j}$. The estimate of undercount in the population size of area j is $N_{adj,j} - N_{unadj,j}$ and the estimate of the corresponding undercount rate is $(N_{adj,j} - N_{unadj,j}) / N_{adj,j}$.

SOURCES OF ERROR IN ADJUSTMENTS

The adjusted estimates are subject to a variety of possible sources of error: sampling error, data collection and survey operations error, missing data, error from exclusion of late census data and data with insufficient information for matching, contamination error, correlation bias, synthetic estimation bias, inconsistent post-stratification, and balancing error.

P-Sample Matching Error and E-Sample Processing Error
Source. The term P-sample matching has been used to describe the search of the census records for enumerations for P-sample respondents. The P-sample respondents are designated as matching an enumeration in the census or as not enumerated. The counterpart for the E sample is called E-sample processing, where census enumerations are designated as correctly enumerated or erroneously enumerated. When the status of a P-sample or E-sample case cannot be determined, it is designated as unresolved. P-sample matching error refers to the net effect of errors that occur during the processing that affect the determination whether a P-sample person matches a census enumeration. Likewise, the net effect of errors in assigning enumeration status to E-sample enumerations during the office processing is called E-sample processing error.

¹ Late enumerations are included with imputations in the original enumeration.

Errors may occur in either direction. P-sample people may be designated as matching a census enumeration although they are not in the census, called a false match, or people may be designated as not enumerated although they are, called a false nonmatch. E-sample enumerations may be falsely assigned a correct enumeration status, called a false correct enumeration, or enumerations may be incorrectly designated as an erroneous enumeration, a false erroneous enumeration.

Matching error also encompasses errors in the size of the P-sample population that may happen during the processing of the P-sample. These errors also may occur in either direction. A person included as a member of a household may really reside at another location or not be in the population of interest. For example, the census residency rules consider family members away at college to reside at their college address. A family member in a nursing center is considered to be in the group quarters population, which is not part of the population of interest. Conversely, a person with two homes may be designated as living at the other home, but really live at the one in the sample.

In the application of PES-C, respondents have the potential of many more statuses than was possible in the processing of the P-sample in 1990. The reason is that a P-sample respondent may be a nonmover, an outmover, an inmover, or an out-of-scope person. The nonmovers and outmovers have another characteristic, which is resident or nonresident. A person who is living at the sample address on Census Day is called a resident.

Errors in mover status may go in all directions. A person designated as a nonmover may be an inmover or an outmover. All combinations of errors may happen and affect the DSE in different ways.
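The mover statuses named in this section can be illustrated with a small sketch. It assumes, as a simplification, that status depends only on whether a person is rostered at the sample address on Census Day and on Interview Day, and that a person rostered on neither day is out of scope. The names `MoverStatus` and `classify_mover` are hypothetical and are not taken from any A.C.E. processing system.

```python
from enum import Enum


class MoverStatus(Enum):
    NONMOVER = "nonmover"
    OUTMOVER = "outmover"
    INMOVER = "inmover"
    OUT_OF_SCOPE = "out-of-scope"


def classify_mover(on_census_day: bool, on_interview_day: bool) -> MoverStatus:
    """Assign a mover status from the two roster flags (illustrative rule only)."""
    if on_census_day and on_interview_day:
        return MoverStatus.NONMOVER      # at the address on both days
    if on_census_day:
        return MoverStatus.OUTMOVER      # there on Census Day only
    if on_interview_day:
        return MoverStatus.INMOVER       # there on Interview Day only
    return MoverStatus.OUT_OF_SCOPE      # assumption: on neither roster


statuses = [classify_mover(c, i) for c in (True, False) for i in (True, False)]
```

An error in either roster flag moves a person into a different status, which is why mover-status errors can go in all directions.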
Definition. P-sample matching error affects both the estimates of nonmovers and inmovers in the estimate of the size of the P-sample population. In addition, matching error affects the estimates of the number of nonmover matches, the number of outmovers and outmover matches, and the number of inmovers in the estimate of the number of matches. E-sample processing error affects the estimate of the number of erroneous enumerations. The post-stratum index h is suppressed in the following definitions.

$m_{nms}$ = net P-sample matching error in the nonmover component of $M$
$m_{omims}$ = net P-sample matching error in the outmover and inmover component of $M$
$n_{nms}$ = net P-sample matching error in the nonmover component of $N_P$
$n_{ims}$ = net P-sample matching error in the inmover component of $N_P$
$ee_s$ = net E-sample office processing error in $CE$

Under the assumption that all other errors are zero, the bias in the adjustment factors caused by P-sample matching error and E-sample processing error is defined as

$$B_{process,h} = A_h - \frac{C_h - II_h}{C_h} \cdot \frac{CE_h - B_{CE\,process,h}}{N_{E,h}} \cdot \frac{N_{P,h} - B_{P\,process,h}}{M_h - B_{M\,process,h}}.$$

The error component definitions include a ratio adjustment because they are estimated using the Evaluation Sample. The ratio adjustment for components from the P-sample is the ratio of the P-sample population total from the A.C.E. to the P-sample population total based on the Evaluation Sample, $N_{PF}$. The ratio adjustment for the components from the E-sample is the ratio of the two E-sample totals defined comparably.
$$B_{M\,process} = [m_{nms} + m_{omims}]\,[N_P / N_{PF}]$$
$$B_{P\,process} = [n_{nms}]\,[N_P / N_{PF}]$$
$$B_{CE\,process} = [ee_s]\,[N_E / N_{EF}]$$

P-Sample and E-Sample Data Collection Error

Source. Errors may occur during the data collection. While an interview is in progress, the respondent may make an error in answering a question, or the interviewer may make an error in asking a question or recording the answer. Errors also occur when an interviewer goes to the wrong address. Regardless of whether the error is caused by the respondent, the interviewer, or a combination of the two, such errors may cause the matching operation to assign mover status, residency status, or match status incorrectly to a person on the household roster. The A.C.E. interviewer collects both a Census Day roster and an Interview Day roster. A person who resides at the household on both days is classified as a nonmover. A person who lived there only on Census Day is an outmover, while a person who lived there only on Interview Day is an inmover. Persons classified as outmovers and nonmovers may or may not have been a resident at the address on Census Day. Errors in the mover status, residency status, or other errors may cause the matching operation to fail to determine that a person was enumerated and to classify the person as a nonmatch incorrectly.

Sometimes people listed on household rosters do not exist. A more likely scenario is that an interviewer who is having trouble contacting the residents of a housing unit may copy the name from a mailbox and fill in the characteristics. This type of error is called P-sample fabrication. Usually fabricated households cause an underestimate of the match rate, because they are smaller than the average household size and do not match.

A special type of E-sample data collection error is the failure to identify duplicate enumerations. The processing includes a search for duplicate enumerations within the block cluster and the surrounding blocks. Duplicate enumerations outside the block cluster and surrounding blocks are more difficult to find. Identifying these duplications requires the respondent to provide information concerning another address where a household member may also be enumerated. Errors may occur when the respondent does not understand the residency rules or is unaware that a household member may be enumerated at another address. The situations most prone to causing duplicate enumerations are college students enumerated at their family home and their college address, children in joint custody agreements enumerated at both parents' addresses, and people with two residences.

Another type of field error occurs during the listing of the housing units for the census or for the P-sample. The housing units listed as being in the sample block may be in another block or vice versa. These types of errors are called geocoding error. To account for minor geocoding errors in 2000, the search for matches occurred within all block clusters and also in surrounding blocks for a sample of the cases with geocoding errors recorded in the E sample, a design called Targeted Extended Search (TES). The variance estimates for the A.C.E. account for the TES design. Flaws in the execution of the TES may result in biases.

Definition. P-sample fabrication and data collection error affect both the estimates of nonmovers and inmovers in the estimate of the size of the P-sample population. In addition, fabrication and data collection error affect the estimates of the number of nonmover matches, the number of outmovers and outmover matches, and the number of inmovers in the estimate of the number of matches. E-sample data collection error affects the estimate of the number of erroneous enumerations. Again, the post-stratum index h is suppressed in the following definitions.
$m_{nmr}$ = net P-sample data collection error in the nonmover component of $M$
$m_{omimr}$ = net P-sample data collection error in the outmover and inmover component of $M$
$n_{nmr}$ = net P-sample data collection error in the nonmover component of $N_P$
$n_{imr}$ = net P-sample data collection error in the inmover component of $N_P$
$ee_r$ = net E-sample data collection error in $CE$
$m_{nmfp}$ = net P-sample fabrication error in the nonmover component of $M$
$m_{omimfp}$ = net P-sample fabrication error in the outmover and inmover component of $M$
$n_{nmfp}$ = net P-sample data collection error in the nonmover component of $N_P$
$n_{imfp}$ = net P-sample data collection error in the inmover component of $N_P$

Under the assumption that all other errors are zero, the bias in the adjustment factors caused by P-sample data collection error and E-sample data collection error is defined as

$$B_{collect,h} = A_h - \frac{C_h - II_h}{C_h} \cdot \frac{CE_h - B_{CE\,collect,h}}{N_{E,h}} \cdot \frac{N_{P,h} - B_{P\,collect,h}}{M_h - B_{M\,collect,h}}.$$

The error component definitions include a ratio adjustment because they are estimated using the Evaluation Sample. The ratio adjustment for components from the P-sample is the ratio of the P-sample population total from the A.C.E. to the P-sample population total based on the Evaluation Sample. The ratio adjustment for the components from the E-sample is the ratio of the two E-sample totals defined comparably.
$$B_{M\,collect} = [m_{nmr} + m_{nmfp} + m_{omimr} + m_{omimfp}]\,[N_P / N_{PF}]$$
$$B_{P\,collect} = [n_{nmr} + n_{nmfp} + n_{im}]\,[N_P / N_{PF}]$$
$$B_{CE\,collect} = [ee_r]\,[N_E / N_{EF}]$$

Missing Data

Source. A.C.E. data may be missing for a variety of reasons: some A.C.E. interviews fail to take place, some households provide incomplete data on questionnaire items, and in some cases the information for classification as a match or nonmatch is ambiguous. The methods used to compensate for missing data effectively assume that the match status for the case with missing data is equal on average to the status for cases that are similar, except that they have complete data. Missing data on characteristics are imputed from otherwise similar cases with complete data. Nonresponse weighting adjustments are used to account for sampled but noninterviewed households. The P-sample matching and E-sample processing operation assigns unresolved enumeration status to a case when the available data are inadequate to determine whether the person is enumerated in the census, and a probability of being correctly enumerated is imputed for such cases.

Also, error in the resolved cases causes error in the imputations, because the resolved cases are used to form the imputations. Even if the imputation model were perfect, the imputations will have error if the data used to fit the model have error. This type of error is called imputation error due to data error.

Although one can consider the range of effects on the DSE by considering extreme alternatives (e.g., all unresolved matches truly are matches or truly are nonmatches), the range is too wide to be informative about the likely bias.

The bias from the method used to compensate for missing data can in principle be estimated from intensive follow-up of cases with missing data, but in practice the fraction completed by follow-up is too low. The Census Bureau analyzed the missing-data bias by looking at the changes in the DSE when alternative methods were used to compensate for missing data.

Results from the Analysis of Reasonable Alternative Imputation Models are used to estimate the variance component. See Keathley et al. (2001) for details. The results of Reasonable Alternative Imputation Models provided the data for the calculation of the variance-covariance matrix for adjustment factors (the missing data component). The missing data variance-covariance matrix was added to the sampling variance-covariance matrix to obtain a variance-covariance matrix for the adjustment factors that contained the random error due to sampling and imputation for missing data.
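The addition of the two variance-covariance matrices can be sketched as follows. This is only an illustration of the element-wise sum: the two-by-two matrices hold made-up values for two hypothetical post-strata, and the helper name `add_matrices` is an assumption, not part of the Census Bureau's software.

```python
def add_matrices(a, b):
    """Element-wise sum of two equally shaped matrices given as lists of rows."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same shape")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Illustrative (invented) values for the adjustment factors of two post-strata.
sampling_cov = [[4.0e-4, 1.0e-4],
                [1.0e-4, 9.0e-4]]
missing_data_cov = [[1.0e-4, 0.0],
                    [0.0, 2.0e-4]]

# Combined random error due to sampling and to imputation for missing data.
total_cov = add_matrices(sampling_cov, missing_data_cov)
```

Because the two components are simply added, the combined matrix is never smaller on the diagonal than the sampling matrix alone.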
Definition. The Census Bureau models the error due to imputation as a random effect and estimates its variance-covariance matrix. Modeling imputation error as a random effect is motivated by practicalities. In principle, the bias from the method used to compensate for missing data can be estimated from intensive follow-up of cases with missing data, but in practice the fraction completed by follow-up is too low. The variance component due to imputation for missing data has three components.

$V_M$ = variance due to imputation = $V_{RA} + V_B + V_I$

where
$V_{RA}$ = variance due to the imputation model selection
$V_B$ = variance due to the model parameter estimation
$V_I$ = within-person imputation variance.

The imputation variance components due to parameter estimation and within-person estimation are included in the sampling error estimates, leaving the variance due to model selection to be estimated separately. The missing data variance-covariance matrix is added to the sampling variance-covariance matrix to obtain a variance-covariance matrix for the adjustment factors that contains the random error due to sampling and imputation for missing data.

The components of imputation error due to data error affect the estimate of the number of nonmovers, the estimate of the number of nonmovers enumerated, the estimate of the match rate for the outmovers, and the estimate of the number of erroneous enumerations. The post-stratum index h is suppressed in the following definitions.
$m_{nmi}$ = net imputation error due to data error in the nonmover component of $M$
$m_{omi}$ = net imputation error due to data error in the outmover match rate component of $M$
$n_{nmi}$ = net imputation error due to data error in the nonmover component of $N_P$
$ee_i$ = net imputation error due to data error in $CE$.

Under the assumption that all other errors are zero, the bias in the adjustment factor $A_h$ caused by imputation error due to data error is defined as

$$B_{impdata,h} = A_h - \frac{C_h - II_h}{C_h} \cdot \frac{CE_h - B_{CE\,impdata,h}}{N_{E,h}} \cdot \frac{N_{P,h} - B_{P\,impdata,h}}{M_h - B_{M\,impdata,h}}$$

The error component definitions include a ratio adjustment because they are estimated using the Evaluation Sample. The ratio adjustment for components from the P-sample is the ratio of the P-sample population total from the A.C.E. to the P-sample population total based on the Evaluation Sample. The ratio adjustment for the components from the E-sample is the ratio of the two E-sample totals defined comparably.

$$B_{M\,impdata} = [m_{nmi} + m_{omi}]\,[N_P / N_{PF}]$$
$$B_{P\,impdata} = [n_{nmi}]\,[N_P / N_{PF}]$$
$$B_{CE\,impdata} = [ee_i]\,[N_E / N_{EF}]$$

Sampling Error

Source. Sampling error gives rise to random error, quantified by sampling variance, and to a systematic error known as ratio-estimator bias. The sampling variance is present in any estimate based on a sample instead of the whole population. Ratio-estimator bias arises because even if X and Y are unbiased estimators, X/Y typically is biased.

Definition. The sampling variance and ratio-estimator bias for the adjustment factors are

$S^2$ = sampling variance-covariance matrix for the adjustment factors
$B_{ratio,h}$ = ratio-estimator bias in the adjustment factor $A_h$ for post-stratum h.

Random sampling error is reflected in the estimated variance-covariance matrix of the $A_h$'s. The covariance matrix is estimated by the Census Bureau's sampling-error software applied to the A.C.E. data. The software also can be used to produce estimates of ratio-estimator bias.

Correlation Bias

Source. If there is variability of the enumeration probabilities for persons in the same post-stratum, or if there is a dependence between enumeration in the census and in the A.C.E. (e.g., people less likely to be enumerated in the census may also be less likely to be found in the A.C.E.), then correlation bias may arise. Correlation bias is most likely a source of downward bias in the DSE. Evidence of correlation bias in national estimates is provided by sex ratios (males to females) for adjusted numbers that are low relative to ratios derived from demographic analysis of data on births, deaths, and migration.

The information from demographic analysis is insufficient to estimate correlation bias at the post-stratum level, however, and alternative parametric models have been used to allocate correlation bias estimates for national age-race-sex groups down to post-strata. Estimates of correlation bias at the national level provided by demographic analysis information also account for possible error from groups whose probabilities of enumeration are so low that the DSE will fail to account for them. The estimates of correlation bias based on sex ratios are affected by error in the demographic-analysis sex ratios and by possible other biases in the sex ratios in the DSE.
The assumptions and model underlying the measurement of correlation bias are discussed in detail in Bell (2001, 2001b), but are described briefly here. Although there are several models for how correlation bias is distributed, the two-group model was selected. The two-group model relies on the basic assumptions listed below for the estimation of correlation bias. In addition, sensitivity analyses assess the impact of variations in these assumptions.

* The ratio of males to females measured in demographic analysis is more reliable for the two racial groups, Black and non-Black, than the A.C.E.
* There is no correlation bias present in the A.C.E. estimates for females.
* The relative correlation bias is equal across all A.C.E. post-strata within an age-race category.
* The relative impact of other nonsampling errors is equal for males and females at the national level.

The two-group model's assumption that the relative correlation bias is equal across post-strata within an age-race category has the advantage of permitting the estimation of correlation bias through a multiplicative factor applied to the corrected DSE. Even more important, an unbiased estimate of the factor is available under the assumption that the relative impact of the other nonsampling errors is equal for males and females, without actually having to estimate the nonsampling errors.
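One simplified reading of the multiplicative factor is sketched below; it is not the estimator documented in Bell (2001, 2001b). The sketch assumes that, within one age-race group, the female DSE total is free of correlation bias, so the male total implied by the demographic analysis sex ratio can be compared with the male DSE total. The function name `correlation_bias_factor` and all numbers are hypothetical.

```python
def correlation_bias_factor(da_sex_ratio, dse_males, dse_females):
    """Ratio of the DA sex ratio (males per female) to the DSE sex ratio.

    Illustrative assumption: females carry no correlation bias, so
    da_sex_ratio * dse_females is taken as the corrected male total.
    """
    if dse_males <= 0 or dse_females <= 0:
        raise ValueError("DSE totals must be positive")
    dse_sex_ratio = dse_males / dse_females
    return da_sex_ratio / dse_sex_ratio


# Invented national totals for one age-race group.
factor = correlation_bias_factor(da_sex_ratio=0.96,
                                 dse_males=9_400.0,
                                 dse_females=10_000.0)

# The same factor is applied to every male post-stratum in the group,
# reflecting the equal-relative-bias assumption.
male_post_strata = [4_000.0, 5_400.0]
corrected_males = [factor * n for n in male_post_strata]
```

In this example the DSE sex ratio (0.94) is below the demographic analysis ratio (0.96), so the factor exceeds 1 and the male estimates are raised.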
Definition. Correlation bias usually causes the DSE to underestimate the population size.

$B_{correl,h}$ = correlation bias in the adjustment factor $A_h$ for post-stratum h.

Excluded-data Error from Reinstated Census Enumerations

Source. The DSE treats late census data as nonenumerations. Thus, duplicate enumerations among the late data do not contribute to census data, but valid enumerations among the late data are treated as census misses and are estimated by the DSE. If the late census data were excluded from the entire adjustment process and estimation, no new source of error would be present. The adjusted estimates do partially incorporate late census data by including them in $C_h$, but excluding them from the computation of $N_h$. This use of late data affects the estimates for areas with disproportionately many or few late adds, with an effect that is similar to synthetic estimation error. In addition, the exclusion of late census data from the E sample could bias the estimates at the post-stratum level.

There are two conditions that have to be met for the exclusion of the late adds from the processing of the A.C.E. not to bias the dual system estimates at the post-stratum level:

* The P sample covers the correct enumerations among the late adds at the same rate as other correct enumerations.
* The late adds occur in the E sample at the same rate as they occur in the census (excluding the imputations).
Definition. Error due to excluding the reinstated census enumerations in the calculation of the DSE affects the estimate of the DSE and therefore the adjustment factor directly.

$B_{reinstate,h}$ = bias in the adjustment factor $A_h$ for post-stratum h due to excluding reinstated census enumerations.

Contamination Error

Source. Contamination occurs when the A.C.E. selection of a given block cluster alters the implementation of the census there and affects enumeration results, e.g., by increasing or decreasing erroneous enumerations or by increasing or decreasing coverage rates. Contact with residents of the sample blocks during the listing for the P-sample may cause them to not respond to the census, because they think that the listing contact is a response to the census.
Definition. The bias in the adjustment factor for post-stratum h from contamination is defined as follows.

$B_{contam,h}$ = bias in the adjustment factor $A_h$ for post-stratum h due to contamination error.

Synthetic Estimation Bias

Source. The adjustment methodology relies on a method called synthetic estimation to provide the same adjustment factor $A_h$ for all enumerations in a given post-stratum, regardless of whether the enumerations are from the same geographic area. Synthetic estimation bias arises when census counts from different areas but in the same post-stratum should have different adjustment factors.

To assess synthetic estimation bias for a given area, one needs to develop an estimate based on data from the area alone, which is rarely possible. Attempts to estimate synthetic estimation bias in undercount estimates from analysis of artificial populations or surrogate variables, whose geographic distributions are known, are unconvincing. Therefore, sensitivity analyses have been conducted to assess the impact of synthetic estimation bias. These studies show that it is reasonable to assume synthetic estimation has a minor effect on uses of the data.
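A minimal sketch of synthetic estimation for a single area follows, using the overview's definitions: the adjusted estimate is the sum over post-strata of the adjustment factor times the area's census count in that post-stratum. The post-stratum labels, factors, and counts are invented for illustration, and the function name `synthetic_estimates` is hypothetical.

```python
def synthetic_estimates(adjustment_factors, area_counts):
    """Return (unadjusted, adjusted, undercount, undercount_rate) for one area.

    adjustment_factors: post-stratum -> A_h
    area_counts:        post-stratum -> C_{h,j}, the census count in area j
    """
    unadjusted = sum(area_counts.values())
    adjusted = sum(adjustment_factors[h] * c for h, c in area_counts.items())
    undercount = adjusted - unadjusted
    return unadjusted, adjusted, undercount, undercount / adjusted


# Invented values: the same A_h is used no matter which area the counts come from.
factors = {"h1": 1.05, "h2": 1.00}
area_j = {"h1": 1000, "h2": 2000}

unadjusted, adjusted, undercount, rate = synthetic_estimates(factors, area_j)
```

Every area applies the same factor to post-stratum h1, which is exactly where synthetic estimation bias enters if coverage for that post-stratum actually differs between areas.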
Definition. The synthetic estimates may cause a bias in the adjusted estimates for area j. Error from synthetic estimation does not affect the dual system estimate for a post-stratum, only areas within a post-stratum.

$B_{syn,j}$ = bias in the adjusted estimate $N_{adj,j}$ for area j due to synthetic estimation error.

Inconsistent Post-stratification

Source. The computation of $CE_h / N_{E,h}$ requires census enumerations to be assigned to post-strata, and the computation of $N_{P,h} / M_h$ requires P-sample enumerations to be assigned to post-strata. When the assignments are not made consistently for the two samples, error arises in the ratio $N_{P,h} / M_h$. Inconsistent assignments to post-strata may be caused by misreporting of characteristics used in post-stratification. Cases that are most prone to inconsistent classification are those where there is a different respondent for the household in the census and the A.C.E. For example, a household member's age or race may be reported differently in a self-response than when another household member responds for the person. Such inconsistencies also may be due to computer processing errors, as well as inconsistent reporting.

The matches in the A.C.E. sample provide a source of data for estimating the error due to inconsistent post-stratification. An estimate of the error for a post-stratum may be formed by assuming the inconsistency rate observed in the matches also holds for those not matched.
Definition. Error due to inconsistent post-stratification affects the estimate of the DSE and therefore the adjustment factor directly.

$B_{inconsist,h}$ = bias in the adjustment factor $A_h$ for post-stratum h due to inconsistent post-stratification.

Error from Estimating Outmovers with Inmovers

Source. This error is unique to the PES-C model used in the A.C.E. For the PES-C model, the members of the P-sample are the residents of the housing units on Census Day. There is some difficulty in identifying all the residents of all the housing units on Census Day because some move prior to the A.C.E. interview. The A.C.E. interview relies on the respondents to identify those who have moved out, the outmovers. Since the outmovers are identified by proxies, many of the outmovers are not recorded. Therefore, the estimate of outmovers is too low. PES-C uses the number of inmovers to estimate the number of outmovers. The inmovers are those who did not live in the sample blocks on Census Day, but moved in prior to the A.C.E. interview. Theoretically the number of inmovers in the whole country should equal the number of outmovers. However, the number of inmovers may not equal the number of outmovers in a post-stratum because of circumstances such as economic conditions causing more people to move out of an area than to move into an area.
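The PES-C substitution of inmovers for outmovers can be sketched with the overview's formulas. The function names and all input values below are hypothetical. The fallback rule uses the appendix's threshold of fewer than 10 outmovers, and treating that threshold as a count of sample cases is an assumption of this sketch.

```python
def pes_c_ratio(n_n, n_i, m_n, m_o, n_o):
    """N_P / M under PES-C: the inmover count N_i stands in for the number of
    outmovers, and the outmover match rate M_o / N_o gives the matched movers."""
    return (n_n + n_i) / (m_n + (m_o / n_o) * n_i)


def pes_a_ratio(n_n, n_o, m_n, m_o):
    """N_P / M under PES-A: outmovers are used directly, inmovers are ignored."""
    return (n_n + n_o) / (m_n + m_o)


def p_sample_ratio(n_n, n_i, n_o, m_n, m_o, outmover_cases):
    """Choose PES-A when there are fewer than 10 outmover cases (assumed unit)."""
    if outmover_cases < 10:
        return pes_a_ratio(n_n, n_o, m_n, m_o)
    return pes_c_ratio(n_n, n_i, m_n, m_o, n_o)


def dual_system_estimate(c, ii, ce, n_e, np_over_m):
    """N_h = (C_h - II_h) * (CE_h / N_E,h) * (N_P,h / M_h)."""
    return (c - ii) * (ce / n_e) * np_over_m


# Invented weighted estimates for one post-stratum.
ratio_c = pes_c_ratio(n_n=900.0, n_i=100.0, m_n=810.0, m_o=64.0, n_o=80.0)
ratio_a = pes_a_ratio(n_n=900.0, n_o=80.0, m_n=810.0, m_o=64.0)
n_h = dual_system_estimate(c=1000.0, ii=20.0, ce=930.0, n_e=980.0,
                           np_over_m=ratio_c)
```

With these inputs the recorded outmovers (80) fall short of the inmovers (100), so PES-C and PES-A give different ratios; the gap is the kind of difference this error component is meant to capture.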
Definition. The error due to using the inmovers to estimate the outmovers affects the estimates of the size of the P-sample population and the number of matches.

$m_{io,h}$ = net P-sample data collection error in the mover component of $M_h$ in post-stratum h
$n_{io,h}$ = net P-sample data collection error in the mover component of $N_{P,h}$ in post-stratum h.

Under the assumption that all other errors are zero, the bias in the adjustment factors caused by using inmovers to estimate outmovers is defined as

$$B_{inout,h} = A_h - \frac{C_h - II_h}{C_h} \cdot \frac{CE_h}{N_{E,h}} \cdot \frac{N_{P,h} - B_{P\,inout,h}}{M_h - B_{M\,inout,h}}.$$

The error component definitions include a ratio adjustment because they are estimated using the Evaluation Sample. The ratio adjustment for components from the P-sample is the ratio of the P-sample population total from the A.C.E. to the P-sample population total based on the Evaluation Sample. The post-stratum index h is suppressed in the following definitions.

$$B_{M\,inout} = [m_{io}]\,[N_P / N_{PF}]$$
$$B_{P\,inout} = [n_{io}]\,[N_P / N_{PF}]$$

Balancing Error

Source. Balancing error must be addressed in the design of the search areas used to search for E-sample correct enumerations and P-sample matches. Limiting the search for correct enumerations and matches is necessary because the matching operation cannot search the entire census. By limiting the search area, a small percentage of correct enumerations will not be found and a small percentage of matches will not be found. This causes an underestimate of the correct enumerations and an underestimate of the matches. However, the estimate of the net error is not biased as long as the percentage error in the correct enumerations equals the percentage error in the matches. The A.C.E. design avoids balancing error by choosing the same block clusters for the E-sample and the P-sample and drawing the search areas consistently.
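The cancellation behind the balancing argument can be checked numerically. In the sketch below, which uses invented values and the hypothetical helper `dse`, the correct enumerations and the matches are first reduced by the same percentage and then by different percentages.

```python
def dse(c, ii, ce, n_e, n_p, m):
    """Dual system estimate (C - II) * (CE / N_E) * (N_P / M)."""
    return (c - ii) * (ce / n_e) * (n_p / m)


# Invented post-stratum values with an unrestricted search area.
base = dse(c=1000.0, ii=20.0, ce=930.0, n_e=980.0, n_p=1000.0, m=890.0)

# Balanced: a limited search misses 2 percent of correct enumerations
# and 2 percent of matches, so the shortfalls cancel in CE / M.
balanced = dse(1000.0, 20.0, 930.0 * 0.98, 980.0, 1000.0, 890.0 * 0.98)

# Unbalanced: 2 percent of correct enumerations but only 1 percent of
# matches are missed, so the estimate shifts.
unbalanced = dse(1000.0, 20.0, 930.0 * 0.98, 980.0, 1000.0, 890.0 * 0.99)
```

The balanced case reproduces the base estimate up to floating-point rounding, while the unbalanced case falls below it, which is why consistent search areas for the two samples matter.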
Definition. There is not a separate measurement of balancing error. Any balancing error that may arise during the implementation of the A.C.E. will be included in the measurement of data collection error.

Section I. References

Adams, T. and Liu, X. (2001). ESCAP II: Evaluation of Lack of Balance and Geographic Errors Affecting Person Estimates, Executive Steering Committee for A.C.E. Policy II, Report 2.
Baker v. Carr, 369 U.S. 186 (1962).
Belin, T., Diffendal, G., Mack, S., Rubin, D., Schafer, J., and Zaslavsky, A. (1993). Hierarchical Logistic Regression Models for Imputation of Unresolved Enumeration Status in Undercount Estimation, Journal of the American Statistical Association, 88, 1149-1166.
Bell, W. (2001). Accuracy and Coverage Evaluation: Correlation Bias, DSSD Census 2000 Procedures and Operations Memorandum Series #B-12*.
Bell, W. (2001b). ESCAP II: Estimation of Correlation Bias in 2000 A.C.E. Estimates Using Revised Demographic Analysis Results, Executive Steering Committee for A.C.E. Policy II, Report 10.
Byrne, R., Imel, L., Ramos, M., and Stallone, P. (2001). Accuracy and Coverage Evaluation: Person Interviewing Results, Census 2000 Procedures and Operations Memorandum Series #B-5*.
Cantwell, P. (1999). Accuracy and Coverage Evaluation Survey: Overview of Missing Data for P & E Samples, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-3.
Cantwell, P. (2000). Accuracy and Coverage Evaluation Survey: Specifications for the Missing Data Procedures, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-25.
Cantwell, P., McGrath, D., Nguyen, N., and Zelenak, M. (2001). Accuracy and Coverage Evaluation: Missing Data Results, DSSD Census 2000 Procedures and Operations Memorandum Series #B-7*.
Childers, D. (2001). Accuracy and Coverage Evaluation: The Design Document, Census 2000 Procedures and Operations Memorandum Series, Chapter S-DT-1, Revised.
Childers, D., Byrne, R., Adams, T., and Feldpausch, R. (2001). Accuracy and Coverage Evaluation: Person Matching and Followup Results, Census 2000 Procedures and Operations Memorandum Series #B-6*.
Coale, A. (1955). The Population of the United States in 1950 Classified by Age, Sex, and Color - A Revision of Census Figures, Journal of the American Statistical Association, 50, 16-54.
Coale, A. and Rives, N. (1973). A Statistical Reconstruction of the Black Population of the United States, 1880-1970: Estimates of True Numbers by Age and Sex, Birth Rates, and Total Fertility, Population Index, 39(1), 3-36.
Coale, A. and Zelnik, M. (1963). New Estimates of Fertility and Population in the United States, Princeton University Press.
Cox, L. and Ernst, L. (1982). Controlled Rounding, INFOR, Vol. 20, No. 4.
Davis, M. (1991). Preliminary Final Report for PES Evaluation Project P7: Estimates of P-sample Clerical Matching Error from a Rematching Evaluation, 1990 Coverage Studies and Evaluation Memorandum Series #H-2.
Davis, M. (1991b). Preliminary Final Report for PES Evaluation Project P10: Measurement of the Census Erroneous Enumerations - Clerical Error Made in the Assignment of Enumeration Status, 1990 Coverage Studies and Evaluation Memorandum Series #L-2.
Davis, P. (2001). Accuracy and Coverage Evaluation: Dual System Estimation Results, DSSD Census 2000 Procedures and Operations Memorandum Series #B-9*.
Fay, R., Passel, J., Robinson, J. G., and Cowan, C. (1988). The Coverage of Population in the 1980 Census, 1980 Census of Population and Housing: Evaluation and Research Reports, PHC80-E4, U.S. Bureau of the Census, Washington, D.C.
Ghosh, M. and Rao, J. N. K. (1994). Small Area Estimation: An Appraisal, Statistical Science, Vol. 9, No. 1, 55-93.
Gonzalez, M. (1973). Use and Evaluation of Synthetic Estimators, Proceedings of the Social Statistics Section, American Statistical Association.
Gonzalez, M. and Waksberg, J. (1973). Estimation of the Error of Synthetic Estimates, paper presented at the first meeting of the International Association of Survey Statisticians, Vienna, Austria, August 18-25, 1973.
Griffin, R. (1999). Accuracy and Coverage Evaluation Survey: Post-stratification Research Methodology, DSSD Census 2000 Procedures and Operations Memorandum Series
#Q-5.
Griffin, R. and Haines, D. (2000). Accuracy and Coverage Evaluation Survey: Poststratification for Dual System Estimation, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-21.
Griffin, R. and Haines, D. (2000b). Accuracy and Coverage Evaluation Survey: Final Poststratification Plan for Dual System Estimation, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-24.
Griffin, R. and Malec, D. (2001). Accuracy and Coverage Evaluation: Assessment of Synthetic Assumption, DSSD Census 2000 Procedures and Operations Memorandum Series #B-14*.
Haines, D. (2001). Accuracy and Coverage Evaluation Survey: Synthetic Estimation, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-46.
Haines, D. (2001b). Accuracy and Coverage Evaluation Survey: Computer Specifications for Person Dual System Estimation (U.S.) - Re-issue of Q-37, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-48.
Himes, C. and Clogg, C. (1992). An Overview of Demographic Analysis as a Method for Evaluating Census Coverage in the United States, Population Index, 58(4), 587-607.
Hogan, H. (1992). The 1990 Post-Enumeration Survey: An Overview, The American Statistician, Vol. 46(4), 261-269.
Hogan, H. (1993). The 1990 Post-Enumeration Survey: Operations and Results, Journal of the American Statistical Association, 88, 1047-1060.
Hogan, H. (2000). The Accuracy and Coverage Evaluation: Theory and Application, Proceedings of the Survey Research Methods Section, American Statistical Association.
Hogan, H. (2001). Accuracy and Coverage Evaluation Survey: Effect of Excluding Late Census Adds, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-43.
Ikeda, M. (1997). Effect of Using the 1996 ICM Characteristic Imputation and Probability Modeling Methodology on the 1995 ICM P and E-Sample Data, DSSD Census 2000 Dress Rehearsal Memorandum Series #A-20.
Ikeda, M. (1998). Effect of Different Methods for Calculating Match and Residence Probabilities for the 1995 P-Sample Data, DSSD Census 2000 Dress Rehearsal Memorandum Series #A-23.
Ikeda, M. (1998b). Effect of Different Methods for Calculating Correct Enumeration Probabilities for the 1995 E-Sample Data, DSSD Census 2000 Dress Rehearsal Memorandum Series #A-28.
Ikeda, M. (1998c). Effect of Using Simple Ratio Methods to Calculate P-Sample Residence Probabilities and E-Sample Correct Enumeration Probabilities for the 1995 Data, DSSD Census 2000 Dress Rehearsal Memorandum Series #A-30.
Ikeda, M., Kearney, A., and Petroni, R. (1998). Missing Data Procedures in the Census 2000 Dress Rehearsal Integrated Coverage Measurement Sample, Proceedings of the Survey Research Methods Section, American Statistical Association.
Ikeda, M. and McGrath, D. (2001). Accuracy and Coverage Evaluation Survey: Specifications for the Missing Data Procedures; Revision of Q-25, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-62.
Kearney, A. and Ikeda, M. (1999). Handling of Missing Data in the Census 2000 Dress Rehearsal Integrated Coverage Measurement Sample, Proceedings of the Survey Research Methods Section, American Statistical Association.
Keathley, D., Kearney, A., and Bell, W. (2001). ESCAP II: Analysis of Missing Data Alternatives for the Accuracy and Coverage Evaluation, Executive Steering Committee for A.C.E. Policy II, Report 12.
Keeley, C. (2000). Census 2000 Accuracy and Coverage Evaluation Computer Assisted Interview, DSSD Census 2000 Procedures and Operations Memorandum Series #S-QD-02.
Killion, R. A. (1998). Estimation Decisions for the Integrated Coverage Measurement Survey for Census 2000, Census 2000 Decision Memorandum No. 42.
Kostanich, D., Griffin, R., and Fenstermaker, D. (1999). Census 2000 Accuracy and Coverage Evaluation Survey: Sample Allocation and Post-stratification Plans, DSSD Census 2000 Procedures and Operations Memorandum Series
#R-2.
Marks, E. (1979). The Role of Dual System Estimation in Census Evaluation, in K. Krotki (Ed.), Developments in Dual System Estimation of Population Size and Growth, Edmonton: University of Alberta Press, 156-188.
Nash, F. (2000). Overview of the Duplicate Housing Unit Operations, Census 2000 Information Memorandum Number 78.
National Center for Health Statistics (1968). Synthetic State Estimates of Disability, P.H.S. Publication 1759, U.S. Government Printing Office, Washington, D.C.
Raglin, D. and Bean, S. (1999). Outmover Tracing and Interviewing, Census 2000 Dress Rehearsal Evaluation Results Memorandum Series #C-3.
Rao, J. N. K. and Shao, J. (1992). Jackknife Variance Estimation with Survey Data Under Hot Deck Imputation, Biometrika, 79, 811-822.
Reynolds v. Sims, 377 U.S. 533 (1964).
Robinson, J. G. (2001). ESCAP II: Demographic Analysis Results, Executive Steering Committee for A.C.E. Policy II, Report 1.
Robinson, J. G., Ahmed, B., Gupta, P., and Woodrow, K. (1993). Estimation of Population Coverage in the 1990 United States Census Based on Demographic Analysis, Journal of the American Statistical Association, 88, 1061-1071.
Schindler, E. (1998). Allocation of the ICM Sample to the States for Census 2000, Proceedings of the Survey Research Methods Section, American Statistical Association.
Schindler, E. (1999). Comparison of DSE C and DSE A, Census 2000 Dress Rehearsal Evaluation Memorandum #C-8a.
Schindler, E. (2000). Accuracy and Coverage Evaluation Survey: Post-stratification Preliminary Research Results, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-23.
Sekar, C. C. and Deming, W. E. (1949). On a Method of Estimating Birth and Death Rates and the Extent of Registration, Journal of the American Statistical Association, 44, 101-115.
Siegel, J. (1974). Estimates of Coverage of Population by Sex, Race, and Age: Demographic Analysis, 1970 Census of Population and Housing: Evaluation and Research Program, PHC(E)-4, U.S. Bureau of the Census, Washington, D.C.
Siegel, J. and Zelnik, M. (1966). An Evaluation of Coverage in the 1960 Census of Population by Techniques of Demographic Analysis and by Composite Methods, Proceedings of the Social Statistics Section, American Statistical Association.
Starsinic, M. (2001). Accuracy and Coverage Evaluation Survey: Specifications for Covariance Matrix Output Files from Variance Estimation for Census 2000, DSSD Census 2000 Procedures and Operations Memorandum Series #V-4.
Starsinic, M. and Kim, J. (2001). Accuracy and Coverage Evaluation: Computer Specifications for Variance Estimation for Census 2000 - Revision, DSSD Census 2000 Procedures and Operations Memorandum Series #V-5.
U.S. Bureau of the Census (1985). Evaluating Censuses of Population and Housing, Statistical Training Document, ISP-TR-5, Washington, D.C.
West, K. (1991). Final Report for PES Evaluation Project P4: Quality of Reported Census Day Address - Evaluation Follow-up, 1990 Coverage Studies and Evaluation Memorandum Series #D-2.
Winkler, W. (1994). Advanced Methods for Record Linkage, Proceedings of the Survey Research Methods Section, American Statistical Association.
Wolfgang, G. (1999). Request for Dress Rehearsal Surrounding Block Files for A.C.E. Research, unpublished Census Bureau Memorandum.
Wolter, K. (1986). Some Coverage Error Models for Census Data, Journal of the American Statistical Association, 81, 338-346.
Woltman, H., Alberti, N., and Moriarity, C. (1988). Sample Design for the 1990 Census Post Enumeration Survey, Proceedings of the Survey Research Methods Section, American Statistical Association.
ZuWallack, R. (2000). Sample Design for the Census 2000 Accuracy and Coverage Evaluation, Proceedings of the Survey Research Methods Section, American Statistical
Association.

Accuracy and Coverage Evaluation of Census 2000: Design and Methodology
Section II. A.C.E. Revision II
March 2003

Chapter 1. Introduction to A.C.E. Revision II

INTRODUCTION

The Accuracy and Coverage Evaluation (A.C.E.) survey was designed to measure and possibly correct net coverage error in Census 2000. However, because A.C.E. failed to measure a significant number of erroneous enumerations, A.C.E. did not meet these objectives. The Census Bureau's Executive Steering Committee for A.C.E. Policy (ESCAP) recommended twice not to correct the census counts.[1] There are, however, concerns about differential coverage error in Census 2000 data. While the Census 2000 data products will not be corrected, it is possible that improvements could be made to the post-censal population estimates used for survey controls. This is the Census Bureau's motivation for correcting errors in the A.C.E. data and developing improved estimates of the net undercount.
The improved estimates are called A.C.E. Revision II estimates. These revised estimates provide a better picture of Census 2000 coverage and will help us design a better coverage measurement program for 2010. This part of the document provides a description of the methodology used to produce the A.C.E. Revision II estimates. A comprehensive technical description of the methodology used to produce the original estimates of net undercount released in March 2001 is presented in the first half of this publication.

This chapter summarizes the history of the two adjustment decisions and discusses key findings and limitations. It also introduces the key components of the revision and describes the major errors being corrected. The next chapter provides an overview of the revision process and subsequent chapters provide detailed methodology as follows:

Chapter 2: Summary of A.C.E. Revision II Methodology
Chapter 3: Correcting Data for Measurement Error
Chapter 4: A.C.E. Revision II Missing Data Methods
Chapter 5: Further Study of Person Duplication in Census 2000
Chapter 6: A.C.E. Revision II Estimation
Chapter 7: Assessing the Estimates

BACKGROUND

The original A.C.E. estimates were available in February of 2001, in time to allow for the possibility of correcting Census 2000 redistricting files. The Census Bureau's ESCAP recommended in March 2001 not to correct the Census 2000 counts for purposes of redistricting (ESCAP I, 2001). The Secretary of Commerce concurred. Given the information available at the time, this decision was not based on any clear evidence that the census counts were more accurate, but rather on concern that there was some yet undiscovered error in the A.C.E. The A.C.E. estimate of a 3.3 million net undercount was much larger than the Demographic Analysis (DA) estimate of only 340,000.
Further evaluations were conducted over the next 6 months to examine the reasons for the discrepancy and to determine if other Census 2000 data products should be corrected. The Census 2000 redistricting files were the first of many Census 2000 data products scheduled for public release. See the Census Bureau's Web site, www.census.gov, for released data products. The question remained as to whether these other Census 2000 data releases should be corrected. In October 2001, the ESCAP again decided not to correct the census counts for other Census 2000 data products.
[1] The ESCAP recommendations, supporting analyses, technical assessments, and limitations can be found on the Census Bureau's Web site at www.census.gov/dmd/www/EscapRep.html.

Analysis of A.C.E. evaluation data and a study of duplicates in the census revealed that the A.C.E. failed to measure large numbers of erroneous census enumerations (ESCAP II, 2001). This error called into question the quality of the A.C.E. survey results. Some of the key findings from the analyses are:

* An evaluation by Krejsa and Raglin (2001) was the first indication that A.C.E. seriously underestimated erroneous enumerations. This analysis revealed an additional net 1.9 million erroneously enumerated persons for those cases that could be resolved. These results are based on an independent reinterview and matching of about 70,000 E-sample persons. Because of the serious implications of this finding, a further Review Study was conducted.
* The findings from the Review Study by Adams and Krejsa (2001) showed that A.C.E. underestimated erroneous enumerations by a net of 1.45 million persons, which was smaller than the evaluation figure but still a significantly large amount. This figure does not include unresolved cases, so the estimated amount is probably somewhat higher. This study was based on a sample of about 17,000 persons selected from the 70,000 person evaluation follow-up E sample. The most experienced analysts reviewed these cases using both the original A.C.E. person follow-up interviews as well as the reinterview results to determine their enumeration status.
* Mule (2001) showed that Census 2000 suffers from a large number of duplicate enumerations, i.e., persons who were double counted. Mule computer-matched census enumerations in A.C.E. block clusters to those across the entire country. The matching used by Mule was conservative in picking up census duplicates given his requirement for exact matching at the first stage. Within the A.C.E. block clusters, Mule found only 38 percent of the in-scope duplicates that A.C.E. found, leading us to believe that his matching algorithm was underestimating duplicates in the census. Note that A.C.E. was not designed to estimate duplicates outside the search area, and this, itself, was not a design flaw. A.C.E. was, however, expected to determine which census enumerations were erroneous because they were reported at the wrong residence. The design of A.C.E. Revision II accounts for duplicates outside the search area. Mule's study did not distinguish which of the duplicate pair was correct and which was erroneous, but one could easily speculate that half of these should be correct and half should be erroneous.
* Feldpausch (2001) examined the A.C.E. enumeration status for E-sample cases identified by Mule (2001) as duplicates outside the search area. Only 14 percent of the E-sample persons that were duplicates of a person in a housing unit were coded as erroneous by A.C.E.
This was much lower than the expected 50 percent, indicating that A.C.E. underestimated erroneous enumerations due to not perceiving that these E-sample persons should have been counted at other residences. Note that these results suggest measurement error in the original A.C.E. figures released.
* Fay (2001, 2002) then compared the enumeration status for the E-sample Review Study cases to the duplicates identified by Mule (2001) outside the search area. Only 19 percent of the review cases that were duplicates of a person in a housing unit were coded as erroneous by the Review Study. Again, this was much less than the expected 50 percent, indicating that the evaluation data and the special review did not identify all the erroneous enumerations. Using these data, Fay then produced a lower bound on the level of unmeasured erroneous enumerations of 2.9 million.
* There was also evidence that similar problems may have affected the population sample (P sample) which is used to measure the omission rate. A.C.E. evaluation data from Raglin and Krejsa (2001) show that there are measurement errors in determining residency and mover status.

Using Fay's lower bound on the level of unmeasured erroneous enumerations, Thompson et al. (2001) produced a Revised Early Approximation of undercount for three race/Hispanic origin groups. These estimates were intended to be illustrative of net undercount and possible coverage differences. The same methodology and data were later used to expand the calculations to seven race/Hispanic origin groups. See Fay (2002) and Mule (2002) for details. These preliminary estimates show a very small net undercount. The data also indicate that the differential undercount has not been eliminated. These results are limited to the extent that they only provide information at the national level for broad population groups. Furthermore, these preliminary approximations were based on a small subset of A.C.E. data and only partially correct for errors in measuring erroneous enumerations. Potential errors in measuring omissions were not accounted for.

In summary, the A.C.E. results were not acceptable because A.C.E. failed to measure large numbers of erroneous census enumerations. This was the reason for not using the A.C.E., but this does not mean that there were no other errors in the A.C.E. In particular, there was concern about P-sample cases that matched to enumerations suspected of being duplicates. If the E-sample case was erroneous, then that match cannot be valid. The extent of this problem was not quantified at the time of the ESCAP II decision. The level of other errors was small by comparison, and therefore, was not a major factor in this decision.
See Hogan et al. (2002) and Mulry and Petroni (2002) for further information.

Plans for Revising the 2000 A.C.E. Estimates

Even though the ESCAP recommended twice not to correct the census counts, they had concerns about differential coverage error in Census 2000 data. They thought it possible that further research resulting in revised estimates of coverage could potentially be used to improve the post-censal estimates. In addition, revised estimates would provide a better understanding of Census 2000 coverage error that could be used to improve census operations for 2010 and would help in developing better methodologies for the 2010 coverage measurement program.

The major objective was to produce improved estimates of the household population that could be used to measure net coverage error in Census 2000. This meant obtaining better estimates of erroneous census enumeration from the E sample and obtaining better estimates of census omissions from the P sample. Furthermore, since the national net undercount, as indicated by both DA and the Revised Early Approximations, was very close to zero and the census included large numbers of erroneous enumerations in the form of duplicates, it was imperative that the revised methodology carefully account for both overcounts and undercounts. Hogan (2002) summarized the major revision issues in the form of the following five
challenges:

1. Improve estimates of erroneous census enumerations
2. Improve estimates of census omissions
3. Develop new models for missing data
4. Enhance the estimation post-stratification
5. Consider adjustment for correlation bias

There were no field operations associated with the A.C.E. Revision II process. Because of the late date, it was not feasible (or practical) to revisit households for additional data collection. Consequently, the revisions were based on data that had already been collected. One aspect of the strategy for revising the coverage estimates involves correcting measurement error using information from the A.C.E. evaluation data. This is referred to as the recoding operation. Another aspect of these corrections involves conducting a more extensive person duplicate study to correct for measurement error that was not detected by A.C.E. evaluations. This is referred to as the Further Study of Person Duplication (FSPD). The estimation method, discussed briefly in Chapter 2 and more fully in Chapter 6, is designed to handle overlap of errors detected by both of these studies to avoid overcorrecting for measurement error.

The recoding operation was designed to improve estimates of erroneous census enumerations and census omissions. It uses the original A.C.E. person interview (PI) and person follow-up (PFU), the evaluation follow-up interview (EFU), the matching error study (MES), and the PFU/EFU Review Study[2] to correct for measurement error in enumeration status, residence status, mover status and matching status. This effort involved extensive recoding of about 60,000 P-sample cases and more than 70,000 E-sample cases.[3] An automated computer algorithm was used to recode most of the cases, but many required a clerical review by experienced analysts at the National Processing Center. The analysts had access to the questionnaire responses, as well as interviewer notes that put them in a better position to resolve apparent discrepancies. It was not possible to completely code all cases because of missing or conflicting information; however, the number of conflicting cases was relatively small.

The duplicate study was designed to further improve estimates of erroneous census enumerations and census omissions. This study used computer matching and modeling techniques to identify E- and P-sample cases that link to census enumerations across the entire country, including group quarters, reinstated, and deleted census cases. For the E-sample links, this study does not identify which enumeration is correct and which is the duplicate. For P-sample links, this study does not identify whether the correct Census Day residence is at the P-sample location or the census location. This information is used to model the probability that an E-sample linked case is a correct enumeration or that a P-sample case is a resident on Census Day.

New missing data models were developed to reflect the different types of missing data now possible as a result of the recoding operation. There were three new types of missing data to deal with:

1. P-sample households that were originally considered interviews, but the recoding determined that there were no valid Census Day residents,
2. cases with unresolved match, enumeration, or residency status because of incomplete or ambiguous interview data, and
3. cases with conflicting enumeration or residency status, because contradictory information was collected in the A.C.E. PFU and EFU interviews. It was impossible to determine which data were valid for these cases.

A household noninterview weighting adjustment using new cell definitions was used for type 1 above. Imputation cells and donor pools were developed for the second type of missing data based on detailed
responses to the questionnaire. For the conflicting cases in type 3 above, there were no applicable donor pools, and probabilities of 0.5 were imputed for correct enumeration status and Census Day residency status. Fortunately, the recoding operation resulted in a relatively small number of these cases.

The revision effort incorporates separate post-strata for estimating census omissions and erroneous census enumerations because the factors related to each of these are likely to be different. The research effort focused on determining variables related to erroneous enumerations. This was because much of the previous work on developing post-strata focused only on the census omissions, and by default, the same post-strata were applied to the erroneous inclusions. For the E sample, some of the original post-stratification variables have been eliminated and additional variables have been included. Variables such as region, Metropolitan Statistical Area/type of census enumeration area, and tract-level return rate were replaced by proxy status, type and date of census return, and household relationship and size. For the P sample, only the age variable was modified to define separate post-strata for children aged 0 to 9 and those 10 to 17. This was done because the DA estimates suggested different coverage for these groups. The estimated correct enumeration rates and estimated match rates are used to calculate Dual System Estimates (DSEs) for the cross-classification of the E and P post-strata.

[2] The PFU/EFU Review Study was not a planned evaluation. It was a special study conducted in a subsample of the evaluation data to resolve discrepancies between enumeration status in the PFU and EFU.
[3] These are probability subsamples of the original A.C.E. P and E samples and in the context of A.C.E. Revision II are called revision samples, but they are in fact equivalent to the evaluation follow-up samples.

The A.C.E. Revision II DSEs include an adjustment for correlation bias. Correlation bias exists whenever the probability that an individual is included in the census is dependent on the probability that the individual is included in the A.C.E. This form of bias generally has a downward effect on estimates, because people missed in the census may be more likely to also be missed in the A.C.E. Since the intent of the A.C.E. Revision II is to estimate the net coverage error, it is important to carefully account for errors of omission and errors of erroneous inclusion. In previous coverage measurement surveys, the erroneous inclusions were assumed to be much smaller than omissions. Consequently, not adjusting for correlation bias had the effect of understating the net undercount; relative to the census, the resulting correction was in the right direction, but just not big enough. In the presence of large numbers of overcounts, this assumption is no longer valid and it is possible that a correction might not even be in the right direction when the estimate is close to zero. For example, if there is a small true net undercount, it is possible to estimate an overcount because the DSE would underestimate population in the presence of correlation bias. Estimates of correlation bias were calculated using the two-group model and sex ratios from DA. The sex ratio is defined as the number of males divided by the number of females. This model assumes no
correlation bias for females or for males less than 18 years of age, and that Black males have a relative correlation bias that is different from the relative correlation bias for non-Black males. The correlation bias adjustment is also done by three age categories: 18-29, 30-49, and 50 and over, with the exception of non-Black males 18 to 29 years of age. This is because the A.C.E. Revision II sex ratio for non-Blacks 18-29 exceeds the corresponding modified DA sex ratio, which is likely a result of a data problem. This model further assumes that relative correlation bias is constant over male post-strata within age groups.

The DSEs, adjusted for correlation bias, are used to produce coverage correction factors for each of the cross-classified post-strata. These factors are applied or carried down within the post-strata to produce estimates for geographic areas such as counties or places. This process is referred to as synthetic estimation. The key assumption underlying this methodology is that the net census coverage, estimated by the coverage correction factor, is relatively uniform within the post-strata. Failure of this assumption leads to synthetic error.

Chapter 2. Summary of A.C.E. Revision II Methodology

INTRODUCTION

The original A.C.E. estimates were found to be unacceptable because they failed to detect significant numbers of erroneous census enumerations. There were also suspicions that the A.C.E. may have included residents in its P sample that were actually nonresidents. Thus, the major goal in revising the A.C.E. estimates included a correction of these measurement errors. One aspect of these corrections involved correcting a subsample of the A.C.E. data.
Another aspect involved correcting measurement errors that could not be detected with the information available in the subsample. These additional errors were identified via a duplicate study. The purpose of this chapter is to present a high-level overview of the process used to produce A.C.E. Revision II estimates of the population coverage of Census 2000. Further details concerning the methodology and procedures are included in subsequent chapters.

Background

The chronology of events leading to the corrected A.C.E. Revision II results was as follows:

1. The A.C.E. estimates produced in March 2001 were based on the Full E and P samples, which were probability samples of over 700,000 persons in 11,303 block clusters.
2. The Matching Error Study (MES) and the Evaluation Follow-up (EFU) were two programs that evaluated the March 2001 A.C.E. estimates. The MES measured errors introduced when the census and A.C.E. interviews were matched. The EFU, which was designed to study unusual living situations, entailed another interview. It evaluated the Census Day residency, enumeration status and mover status assigned during the A.C.E. interview and A.C.E. Person Follow-up (PFU) interview. The MES and EFU were conducted in a subsample of 2,259 block clusters selected from the original 11,303 block clusters. A further subsample of persons within these block clusters was selected for the EFU evaluation.
3. The PFU/EFU Review occurred next; it was not part of the planned evaluations. It was done in order to resolve major discrepancies in enumeration status between the EFU and PFU results. Thus, the Review E sample was a subsample of the EFU E sample.
4. At this point the A.C.E. Revision II program commenced. The Revision E and P samples were developed for purposes of producing A.C.E. Revision II estimates. They are each comprised of about 70,000 sample persons. These samples were essentially the same as the evaluation E and P samples for EFU, but the data have undergone a major recoding to correct for measurement error. These data, along with other measurement error corrections identified by the duplicate study, were used to adjust the Full E and P samples to produce A.C.E. Revision II estimates.

The A.C.E. Revision II process is presented below. First, the corrections for measurement error (undetected erroneous enumerations and P-sample nonresidents) in the Revision Samples are explained. Then, a discussion is given of the missing data methods applied to cases whose match, residency or enumeration status had changed in the Revision Samples. Next, the process for identifying census duplicates across the entire nation is discussed. An applicable dual system estimation formula that incorporates these changes and accounts for correlation bias is presented.
Finally, synthetic estimation was employed to produce A.C.E. Revision II results. See Kostanich (2003) for a summary of the methodology.

CORRECTING MEASUREMENT ERROR IN THE REVISION SAMPLES

As previously stated, the original A.C.E. process (step 1. above) failed to detect significant numbers of erroneous census enumerations (EEs). These undetected EEs (one part of measurement error in the A.C.E.) were uncovered during the evaluations of the A.C.E. (step 2. above). In general, the original A.C.E. Person Interview (PI) and PFU, the EFU interview, the MES, and the PFU/EFU Review results were used to correct for measurement error in the enumeration, residency, mover, and match statuses for subsamples of the Full A.C.E., called the Revision E and P samples. No additional data were collected in this measurement error correction process.

The Revision Samples underwent extensive recoding using all available data indicated above. This recoding included the original interview and matching results, the evaluation interview and matching results, as well as the recoding done for the PFU/EFU Review.

The A.C.E. Revision II recoding operation was an extension of the PFU/EFU Review clerical recoding, which was used to examine discrepancies between enumeration status in the original A.C.E. and the Evaluation Follow-up (EFU). Given the information available, the recoding that was done on the 17,500 cases in the Review E sample was considered to have negligible error, since these data were reviewed and recoded by expert matchers using rules consistent with census residence rules.

An automated coding algorithm based on specific responses to the PFU and the EFU questionnaires was used to determine an appropriate code for each case. This was done for both the PFU interview and the EFU interview. The automated coding also assigned a Why code that described the reason why the particular code was assigned.

A three-step process was followed to assign final codes to each case:

1. Validation. Determine, for categories of Why codes, if the automated coding was of high quality based on level of agreement with the Review data.
2. Targeting. Target only those Why code categories that had codes produced by automated coding that had low levels of agreement with the Review data.
3. Clerical coding. Clerically recode only cases in the targeted Why code categories. The clerical recoding took advantage of handwritten interviewer comments.

In general, cases did not go to clerical review if both the PFU and EFU automated codes agreed, the mover statuses also agreed, and the Why code category was deemed to be of high enough quality.

After the A.C.E. Revision II recoding operation corrected for enumeration, residency, and mover status, the results of the MES were used to correct for false matches and false nonmatches. Some matching errors were a result of incorrect residency status coding and had been corrected as part of the recoding operation discussed above. To determine the correct match status, each of the possible combinations of match status was reviewed to determine the appropriate match status for each type of case. In general, the MES match status was assigned when there were changes from a match to a nonmatch or changes from a nonmatch to a match. For other situations the match status from the EFU coding was assigned. See Krejsa and Adams (2002) for further details.

ADJUSTMENT FOR MISSING DATA

As with all survey data, it is not possible to obtain interviews for all sample cases, nor is it possible to obtain answers to all interview questions. For the Full A.C.E. E and P samples, household noninterview adjustments were used to adjust for noninterviewed households. In addition, imputation methods were used to adjust for missing characteristics such as age or tenure, as well as enumeration, residency, and match status. For the A.C.E. Revision II work, these missing data adjustments for the Full A.C.E. E and P samples were essentially unchanged from those used to produce the March 2001 A.C.E. estimates.

For the Revision E and P samples, however, there were three new types of missing data to deal with:

1. Noninterviewed households: Revision P-sample households that were considered interviews in the A.C.E. P sample, but were identified as noninterviews in the
Revision coding because it was determined that there were no valid Census Day residents;
2. Revision E- or P-sample cases with unresolved match, enumeration, or residency status because of incomplete or ambiguous interview data;
3. Revision E- or P-sample cases with conflicting enumeration or residency status. This occurred when contradictory information was collected in the A.C.E. PFU and the EFU interviews and it could not be determined which was valid.

Household Noninterview Adjustment for the Revision P Sample

For the original March 2001 A.C.E. estimates, the household noninterview adjustment generally spread the weights of the Full P-sample noninterviewed housing units over interviewed housing units in the same block cluster with the same housing unit structure type. The methodology for the Revision P-sample household noninterview adjustment for Interview Day was essentially unchanged from that used for the Full P sample. There was, however, an important change for the noninterview adjustment for Census Day residency. A separate cell was defined for new noninterviews due to whole households of persons determined to be inmovers or nonresident outmovers based on the recoding that was done to correct for measurement error.

Imputation for Revision E- or P-Sample Unresolved Cases

In the Full A.C.E. P sample, persons with unresolved Census Day residency or match status came about in two ways. First, the person interview (PI) may not have provided sufficient information for matching and follow-up. Second, the Person Follow-up (PFU) may not have collected adequate information to determine a person's Census Day residency status or their match status. The imputation method differed by how the case came to be unresolved.

Revision P-sample persons with insufficient information for matching and follow-up tended also to have had insufficient information in the original coding of the Full P sample, except for some rare coding changes. These persons with insufficient information were not sent out for an Evaluation Follow-up interview.

For the Revision P sample, the imputation of Census Day residency was improved upon by defining finer imputation cells that included whether or not the housing unit was matched, not matched, or had a conflicting household. The probability of a match was imputed based on the overall match rate for five groups defined by mover status, housing unit match status as in the original A.C.E., and also on conflicting household status.

For Revision P- and E-sample persons who were unresolved because of ambiguous or incomplete follow-up information, the situation was more complicated because there were two follow-up interviews to consider, the PFU and EFU.

For the Full E and P samples, imputation cells were based mostly on information obtained before any follow-up was conducted. For the Revision E and P samples, imputation cells relied on the after follow-up information. This change was the single most important improvement in the missing data methodology.

Imputation for Revision E- or P-Sample Conflicting Cases

When the A.C.E. PFU and EFU interviews had contradictory information, the case was assigned a code of conflicting.
All cases determined to be conflicting based on the automated recoding were sent to analysts for further clerical review. By examining the handwritten notes of interviewers, the analysts could often determine which of the interviews was better and assign an appropriate code. There were some cases where the interviews appeared to be of equal quality, such as both respondents were household members or both respondents were of equal caliber proxy. For these conflicting cases, the interviews seemed equally valid based on the expertise of the analysts. Therefore, probabilities of 0.5 were imputed for correct enumeration for Revision E-sample conflicting cases and for Census Day residency for Revision P-sample conflicting cases.

FURTHER STUDY OF PERSON DUPLICATION

Earlier work showed that correcting measurement error by recoding was not going to correct all the missed erroneous enumerations. Evaluations of the March 2001 A.C.E. coverage estimates indicated the A.C.E. failed to detect a large number of erroneous census enumerations. One type of census erroneous enumerations was duplicate census enumerations; that is, census enumerations included in the census two or more times. The A.C.E. was not specifically designed to detect duplicate census enumerations beyond the A.C.E. search area (the area where census and A.C.E. person matching was conducted). However, there was an expectation that the A.C.E. would detect that these E-sample enumerations had another residence and that roughly half the time this other residence was the usual residence. Feldpausch (2001) showed this expectation was not met.

For purposes of constructing A.C.E. Revision II estimates, the study of person duplication used matching and modeling techniques to identify duplicate links between the Full E and P samples to census enumerations. Links to group quarters, reinstated, deleted and E-sample eligible records throughout the entire nation were allowed. The matching algorithm used statistical matching to identify linked records. Statistical matching allowed for the matching variables not to be exact on both records being compared.
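The idea of linking records whose matching variables agree closely, but not exactly, can be illustrated with a small Python sketch. This is not the Census Bureau's matching software: the field names (`first`, `last`, `dob`), the similarity measure (the standard library's `difflib.SequenceMatcher`), the 0.85 threshold, and the rule of skipping fields that are missing on either record are all assumptions chosen only for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical matching variables; the actual study used its own variable set.
FIELDS = ("first", "last", "dob")


def link_score(rec_a, rec_b, fields=FIELDS):
    """Average string similarity over fields present on both records.

    Fields missing on either record are skipped, so missing data does not
    by itself prevent a link. At least two comparable fields are required.
    """
    sims = []
    for name in fields:
        a, b = rec_a.get(name), rec_b.get(name)
        if a and b:
            sims.append(SequenceMatcher(None, a.lower(), b.lower()).ratio())
    if len(sims) < 2:
        return 0.0
    return sum(sims) / len(sims)


def is_link(rec_a, rec_b, threshold=0.85):
    """Declare a link when the average similarity reaches the (assumed) threshold."""
    return link_score(rec_a, rec_b) >= threshold


sample = {"first": "Katherine", "last": "Johnson", "dob": "1961-04-12"}
typo = {"first": "Katharine", "last": "Johnson", "dob": "1961-04-12"}
no_dob = {"first": "Katherine", "last": "Johnson", "dob": None}
other = {"first": "Robert", "last": "Smith", "dob": "1975-09-30"}

print(is_link(sample, typo), is_link(sample, no_dob), is_link(sample, other))
```

A link found this way only says that two records look alike. It does not establish that they are the same person, which is why a separate duplicate probability is attached to each link.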
Because linked records may not refer to the same individual even when the characteristics used to match the records were identical, modeling techniques were used to assign a measure of confidence, the duplicate probability, that the two records refer to the same individual.

Matching Algorithm

The matching algorithm consisted of two stages. The first stage was a national match of persons using statistical matching. Statistical matching links records based on similar characteristics or close agreement of characteristics. Statistical matching allowed two records to link in the presence of missing data and typographical or scanning errors.

The second stage of matching was limited to matching persons within households that contained a link from the first stage. The first stage established a link between two housing units. The second stage was a statistical match of all household members in the sample housing unit to all household members in the census housing unit.

Modeling Techniques

The set of linked records consists of both duplicated enumerations and person records with common characteristics. Using two modeling approaches, the probability that the linked records were the same person was estimated. One approach used the results of the statistical matching and relied on the strength of multiple links within the household to indicate person duplication. The second relied on an exact match of the census to itself and the distribution of births, names, and population size to indicate if the individual link was a duplicate. These two approaches were combined to yield an estimated duplicate probability for the linked records from the statistical matching of the Full E and P samples to the census. See Chapter 5 for a full discussion on the person duplication study.

THE A.C.E. REVISION II DSE FORMULA

With the correction of measurement error in the Revision E and P samples, the adjustment for missing data in the Revision E and P samples, and the determination of census duplicate links between the Full E and P samples and census enumerations, the dual system estimation formula can be applied. The following sections explain the formula and its adjustment for the A.C.E. Revision II work.

Using procedure C for movers and different post-strata for the E and P samples, the DSE formula can be written as:
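$$
\mathrm{DSE}^{C}_{ij} = \left(Cen'_{ij} - II'_{ij}\right)
\frac{\left[\dfrac{CE_{i}}{E_{i}}\right]}
{\left[\dfrac{M_{nm,j} + \left[\dfrac{M_{om,j}}{P_{om,j}}\right] P_{im,j}}{P_{nm,j} + P_{im,j}}\right]}
$$

The A.C.E. Revision II DSE formula using procedure C for movers, separate E and P post-strata, measurement error corrections from the E and P Revision Samples, and duplicate study results is:

$$
\mathrm{DSE}^{C,\mathrm{Re}}_{ij} = \left(Cen'_{ij} - II'_{ij}\right)
\frac{\left[\dfrac{CE^{ND}_{i}\, f_{1,i'} + \widetilde{CE}^{D}_{i}}{E_{i}}\right]}
{\left[\dfrac{M^{ND}_{nm,j}\, f_{2,j'} + \widetilde{M}^{D}_{nm,j} + \left[\dfrac{M_{om,j}\, f_{3,j'}}{P_{om,j}\, f_{4,j'}}\right]\left(P_{im,j}\, f_{5,j'} + g\left(P^{D}_{nm,j} - \widetilde{P}^{D}_{nm,j}\right)\right)}{P^{ND}_{nm,j}\, f_{6,j'} + \widetilde{P}^{D}_{nm,j} + P_{im,j}\, f_{5,j'} + g\left(P^{D}_{nm,j} - \widetilde{P}^{D}_{nm,j}\right)}\right]}
$$

Recall that the $II'$ term excludes the late census adds.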
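A minimal numerical sketch of the two formulas above, in Python, may help in reading them. The function and argument names are hypothetical and every input value is invented for illustration. The sketch assumes that the terms inside each bracket are combined by addition, that the census term is the difference between the census count and II', and that the duplicate terms carrying the residency-given-duplication adjustment are passed in already adjusted (the `_res` suffix); the symbols themselves are defined in the Notation that follows.

```python
def dse_procedure_c(cen, ii, ce, e, m_nm, m_om, p_om, p_im, p_nm):
    """Procedure C dual system estimate for one E-by-P post-stratum cell."""
    ce_rate = ce / e                    # E-sample correct enumeration rate
    m_im = (m_om / p_om) * p_im         # outmover match rate applied to inmovers
    match_rate = (m_nm + m_im) / (p_nm + p_im)
    return (cen - ii) * ce_rate / match_rate


def dse_revision_ii(cen, ii, e, ce_nd, ce_d_res, m_nm_nd, m_nm_d_res,
                    m_om, p_om, p_im, p_nm_nd, p_nm_d, p_nm_d_res, f, g):
    """Revision II form: f[1]..f[6] are measurement error ratio adjustments,
    g moves duplicate-linked nonresident nonmovers into the inmover count."""
    ce_rate = (ce_nd * f[1] + ce_d_res) / e
    inmovers = p_im * f[5] + g * (p_nm_d - p_nm_d_res)
    matches = (m_nm_nd * f[2] + m_nm_d_res
               + (m_om * f[3]) / (p_om * f[4]) * inmovers)
    p_total = p_nm_nd * f[6] + p_nm_d_res + inmovers
    return (cen - ii) * ce_rate / (matches / p_total)


# Invented cell totals, for illustration only.
base = dse_procedure_c(cen=1000, ii=20, ce=930, e=980,
                       m_nm=850, m_om=40, p_om=50, p_im=60, p_nm=900)

# With no duplicate links, all f equal to 1 and g equal to 0, the Revision II
# expression collapses to the procedure C expression.
no_adjust = {k: 1.0 for k in range(1, 7)}
same = dse_revision_ii(cen=1000, ii=20, e=980, ce_nd=930, ce_d_res=0.0,
                       m_nm_nd=850, m_nm_d_res=0.0, m_om=40, p_om=50,
                       p_im=60, p_nm_nd=900, p_nm_d=0.0, p_nm_d_res=0.0,
                       f=no_adjust, g=0.0)

print(round(base, 1), round(same, 1))  # both about 994.2
```

The second call is a consistency check rather than a realistic case: it shows that the added terms only change the estimate where duplicate links or recoding corrections are present.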
Notation

Terms
$CE$: Correct enumerations
$E$: E-sample total
$M$: Matches
$P$: P-sample total
$f$: Adjusts for measurement error
$g$: Adjusts nonmovers to movers due to duplication

Subscripts
$i, j$: Full E and P post-strata
$i', j'$: Revision E and P measurement error correction post-strata
$nm, om, im$: nonmover, outmover, inmover

Superscripts
$C$: DSE Procedure C for movers
$ND$: Not a duplicate to census enumeration outside search area
$D$: Duplicate to census enumeration outside search area
$\sim$ (tilde): Includes probability adjustment for residency given duplication

Adjustment for Duplicates using the Duplicate Study

The first task was to adjust the usual dual system estimate formula for those cases that have a link to a census enumeration outside the A.C.E. search area. P- and E-sample cases with links to census enumerations were assigned a nonzero probability of being a duplicate. P- and E-sample cases without duplicate links were assigned a probability of zero.

When estimating terms in the A.C.E. Revision II DSE involving nonduplicates, those indicated by a superscript ND, it was necessary to include the probability of not being a duplicate in the tallies. This probability of not being a duplicate was included in all of the terms involving the ND superscript.

Although the duplicate study identified E- and P-sample cases linking to census enumerations outside the A.C.E.
searcharea,thisstudycouldnotdeterminewhichcompo-nentofthelinkwasthecorrectonesincetherewerenoadditionaldatacollectedtodeterminethis.OntheE-sampleside,thisstudydoesnotidentifywhetherthe linkedE-samplecaseisthecorrectenumeration.OntheP-Sampleside,thisstudydoesnotidentifywhetherthelinkedP-samplecaseisaresidentonCensusDay.Thus,it wasnecessarytoestimatetwoconditionalprobabilities,whicharereflectedfortheEsamplein CEi D.InthePsample,theseprobabilitiesarereflectedinthenonmover terms Pnm , j D and Mnm , j D.AdjustmentforMeasurementErrorUsingtheRevisionEandPSamplesNext,anadjustmentismadeforothermeasurementerrorsnotaccountedforbytheduplicatestudy.Thisadjustmentwasappliedonlytononduplicatetermstoavoidover-correctionduetoanyoverlapbetweenthe duplicatestudyandcorrectionofmeasurementerror.InsupportoftheA.C.E.RevisionIIprogram,theRevisionSampleshaveundergoneextensiverecodingusingall availableinterviewdataandmatchingresults.Missing dataadjustmentshavealsobeenappliedtotheRevisionSamples.ThisrecodeddatafromtheRevisionSampleswereusedtocorrectformeasurementerrorintheoriginal FullEandPsamples.TheratioadjustmentsthatcorrectformeasurementerrorwerebasedontheEorPRevisionSampleandwerearatio ofanestimateusingtheRevisioncodingtotheestimateusingtheoriginalcoding.Theseadjustmentsweredonebymeasurementerrorcorrectionpost-strata i'or j'andaredenotedbythe ftermsintheA.C.E.RevisionIIDSEfor-mula.Theterm gadjuststhenumberofinmoversforthoseFullP-samplenonmoverswhoaredeterminedtobenonresi-dentsbecauseofduplicatelinks.Someofthesenonresi-dentsarenonresidentsbecausetheyareinmoversandshouldbeaddedintothecountofinmovers.Theterm P nm , j D-Pnm , j Disanestimateofnonresidentsamongnonmov-erswithduplicatelinks.AdjustmentforCorrelationBiasUsingDemographicAnalysisNext,theA.C.E.RevisionIIDSEestimatesareadjustedtocorrectforcorrelationbias.Correlationbiasexistswhen-evertheprobabilitythatanindividualisincludedinthe2-4SectionIIChapter2SummaryofA.C.E.RevisionIIMethodologyU.S.CensusBureau,Census2000 
census is not independent of the probability that the individual is included in the A.C.E. This form of bias generally has a downward effect on estimates, because people missed in the census may be more likely to also be missed in the A.C.E. Estimates of correlation bias are calculated using the two-group model and sex ratios from Demographic Analysis (DA). The sex ratio is defined as the number of males divided by the number of females. This model assumes no correlation bias for females or for males under 18 years of age, and that Black males have a correlation bias that is different from the relative correlation bias for non-Black males. The correlation bias adjustment is also done by three age categories: 18-29, 30-49, and 50 and over. This model further assumes that relative correlation bias is constant over male post-strata within age groups. The Race/Hispanic Origin Domain variable is used to categorize Black and non-Black.

The DA totals are adjusted to make them comparable with A.C.E. Race/Hispanic Origin Domains. Black Hispanics are subtracted from the DA total for Blacks and added to the DA total for non-Blacks. This is done because the A.C.E. assigns Black Hispanics to the Hispanic domain, not the Black domain. The second adjustment deletes the group quarters (GQ) people from the DA totals using Census 2000 data. The reason for making this adjustment is that the GQ population is not part of the A.C.E. universe. A final adjustment that could have been made would have been to remove the remote Alaska population from the DA totals, since it too is not part of the A.C.E. universe. Since this population is small, the DA sex ratios would not be affected in any meaningful way. See U.S. Census Bureau (2003) for technical details.

SYNTHETIC ESTIMATION

The coverage correction factors for detailed post-strata ij were calculated as:
$$CCF_{ij} = \frac{ DSE^{C}_{ij} }{ Cen_{ij} }$$

where:

$DSE^{C}_{ij}$ are the correlation bias adjusted DSEs for post-strata ij.

$Cen_{ij}$ are the census counts for post-strata ij, including late census adds.

A coverage correction factor was assigned to each post-stratum. The post-strata excluded persons in group quarters or in remote Alaska. Effectively, these persons have a coverage correction factor of 1.0. In dealing with duplicate links to group quarters persons, the person in the group quarters was treated as if (s)he was a correct enumeration or as if this was their correct residence on Census Day. A synthetic estimate for any area or population subgroup b is given by:
$$N_{b} = \sum_{ij \in b} Cen_{b,ij} \, CCF_{ij}$$

Chapter 3. Correcting Data for Measurement Error

INTRODUCTION

The original A.C.E. estimates were found to be unacceptable because they failed to detect significant numbers of erroneous census enumerations. There were also suspicions that the A.C.E. may have included residents in its P sample who were actually nonresidents. Thus, the major goal for the A.C.E. Revision II estimates includes a correction of these measurement errors. One aspect of these corrections involves correcting a subsample of the A.C.E. data. Another aspect involves correcting measurement errors that cannot be detected with the information available in the subsample. These additional errors, which are identified via a duplicate study, are discussed in Chapter 5.

To understand the measurement error correction process, it is important to be familiar with the various sources of available information. These are summarized in the following table. The A.C.E. estimates produced in March 2001 were based on the Full E and P samples, which are probability samples of over 700,000 persons in 11,303 block clusters. The Matching Error Study (MES) and the Evaluation Follow-up (EFU) were two programs that had been planned to evaluate the March 2001 A.C.E. estimates. These evaluations were conducted in a subsample of 2,259 block clusters selected from the original 11,303 block clusters. A further subsample of persons within these block clusters was done for the EFU evaluation. The probes used for EFU were designed to capture unusual living situations. The PFU/EFU Review was not part of the planned evaluations. It was conducted in order to resolve major discrepancies in enumeration status between the EFU and PFU results.
Thus, the Review E sample is a subsample of the EFU E sample. The Revision E and P samples are referred to as such for purposes of producing A.C.E. Revision II estimates. These samples are essentially the same as the Evaluation E and P samples for EFU, but the data have undergone a major recoding to correct for measurement error.

Table 3-1. Overview of A.C.E. Revision II Data Sources

- Program: Decennial census. What and when: Spring 2000.
- Program: A.C.E. Sample: Full E and P samples. Sample size: E and P, about 700,000 persons in 11,303 block clusters. What and when: A.C.E. Person Interviewing (PI), Summer 2000; A.C.E. Person Follow-up (PFU), Fall 2000.
- Program: Matching Error Study (MES). Sample: Evaluation E and P samples. Sample size: E and P, about 170,000 persons in 2,259 block clusters. What and when: Rematching Operation, December 2000.
- Program: Evaluation Follow-up (EFU). Sample: EFU E and P samples [1]. Sample size: E, about 77,000 persons in 2,259 block clusters; P, about 61,000 persons in 2,259 block clusters. What and when: Evaluation Person Follow-up (EFU), January-February, 2001.
- Program: PFU/EFU Review. Sample: Review E sample. Sample size: E, about 17,500 persons in 2,259 block clusters. What and when: Recoding Operation, Summer 2001.
- Program: A.C.E. Revision II. Sample: Revision E and P samples. Sample size: E, about 77,000 persons in 2,259 block clusters; P, about 61,000 persons in 2,259 block clusters. What and when: Recoding Operation, Summer 2002.

[1] The number of sample cases included in the Evaluation Follow-up is less than those selected to be in this sample. Cases were excluded from follow-up for certain situations such as insufficient information or a duplicate enumeration.

This chapter discusses the measurement error corrections made to the E- and P-Revision samples. These corrected data, along with other measurement error corrections identified by the duplicate study, were used to adjust the Full E and P samples to produce A.C.E.
Revision II estimates.

GOALS AND BACKGROUND

The goal for A.C.E. Revision II was to correct as much measurement error as possible in the original A.C.E. estimates, given resource and timing constraints.[2] The primary sources of measurement error were determining residence and enumeration status, match status, and mover status.

Residence and Enumeration Status. The original A.C.E. did not detect all of the erroneous enumerations. See Adams and Krejsa (2001) and Fay (2002) for documentation. The Evaluation Follow-up (EFU) detected approximately 1.4 million additional erroneous enumerations in the E sample. Since the coding of enumeration status in the E sample was identical to the coding of residence status in the P sample, similar results for P-sample residence status coding were expected (i.e., additional nonresidents were expected to be found as a result of the EFU). To correct for the residence status errors, the A.C.E. Revision II utilized a recoding of the Evaluation Follow-up Interview in combination with the original A.C.E. to determine the best residence or enumeration status for each person in the Revision sample.

Matching Error. The Matching Error Study showed a net difference in match codes between the original March 2001 matching results and the evaluation matching results of 0.41 percent in the E sample and 0.20 percent in the P sample. Bean (2001) suggested this net difference translated into an increase in the dual system estimate of 483,938 people. To correct for matching error, results of the Matching Error Study and the A.C.E. Revision II recoding were used in conjunction to determine the appropriate match status for each person.

Mover Status. Raglin and Krejsa (2001) estimated a 2.6 percent gross difference rate in the mover status between the original A.C.E. and the Evaluation Follow-up. This translated into a negative bias of 465,000 in the DSE (assuming no other biases). Results of the Evaluation Follow-up were used to correct for mover status errors.
The EFU questionnaire contained questions designed to probe for a person's mover status. This information was captured during the clerical recoding and during the initial coding of the Evaluation Follow-up form.

These types of measurement errors were corrected either by computer or clerically. Two other sources of error were not part of the measurement error recoding portion of the A.C.E. Revision II.
These errors included geocoding errors and duplicates outside the search area. Certain geocoding errors detected by various geocoding evaluations were not included in the A.C.E. Revision II.[3] Within the P sample, 245,926 production nonmatched residents were found outside the search area[4] and 195,321 production correct enumerations in the E sample were found outside the search area. See Adams and Liu (2001). Some of the correct enumerations outside the search area were identified by the EFU interview and, hence, were reflected in the revised coding.[5] Duplicates found outside the search area as a result of computer matching (see Chapter 5) were not handled by clerical coding. They were accounted for in the dual system estimator using estimation techniques. See Chapter 6 for a full description of the estimation techniques.

RESIDENCE STATUS AND ENUMERATION STATUS

As already noted, the original March 2001 A.C.E. underestimated the number of erroneous enumerations. To correct for this, the best residence status code was based on available field follow-up data. Duplicates were corrected using a separate process. The following data were available for measurement error correction:
- Person Interview (PI). The PI was the original A.C.E. enumeration of the P sample. It was a Computer-Assisted Personal Interview questionnaire designed to fully enumerate persons in the A.C.E. It was conducted by either phone or personal visit between April and September, 2000.
- Person Follow-up (PFU). The PFU was the follow-up used to assign residence and enumeration status, whenever those items were not determined, after the before follow-up matching (Childers, 2001). It was conducted by personal visit in October and November, 2000, approximately 6-7 months after Census Day.
- Evaluation Follow-up (EFU). The EFU was an evaluation of the A.C.E. designed to detect unusual living situations using additional probes and additional interviewing techniques (e.g., flash cards). It was conducted by personal visit in January and February, 2001, approximately 9-10 months after Census Day.
[2] In order to complete the A.C.E. Revision II estimates on time, 12 weeks were allotted for coding. Analysts at the National Processing Center were expected to code approximately 25,000 cases in this time frame.
[3] As part of the A.C.E., several evaluations of geocoding error were conducted on various subsamples of the A.C.E., most notably Targeted Extended Search 2 (TES2) and Targeted Extended Search 3 (TES3). Results of these evaluations can be found in Adams and Liu (2001).
[4] For the 2000 A.C.E., the search area, or area in which a person can be considered a correct enumeration or match, was the cluster and any census block touching the cluster.
[5] Some of the cases in TES2 were evaluated using the Evaluation Follow-up questionnaire. For these cases, results of the geocoding evaluation were included in the Evaluation Follow-up; however, if a case was in TES2, but not in the Evaluation Follow-up, no geocoding evaluation results were included.

Results of the Person Interview were used to assign A.C.E. residence status by computer to all people in A.C.E. who did not need follow-up. In contrast, the PFU was used to assign residence status for anyone who was eligible for follow-up (Childers, 2001). The PFU is similar to the PI.
The PFU process interviewed both P-sample and E-sample people. The EFU followed up a sample of people sent to PFU and a sample of those not sent to PFU. This allowed the residence/enumeration status of a representative sample of people eligible for field follow-up to be evaluated.

There were measurement errors in both the A.C.E. PFU and EFU resulting from limitations of their respective interviews. These errors are documented in Bean (2001) and Adams and Krejsa (2001), respectively. Also, the EFU was not strictly coded according to census residence rules. To evaluate the E sample for ESCAP II, the Census Bureau conducted the PFU/EFU Review in the summer of 2001. Expert matchers reviewed a subsample of the EFU E sample and applied consistent census residency rules. These analysts were assumed to make negligible errors; therefore, the PFU/EFU Review was considered to be free of coding error, given available data.

For A.C.E. Revision II, this high-quality coding was needed for subsamples of the A.C.E. P and E samples that were large enough to provide accurate subgroup estimates of net coverage. Twelve weeks of coding time were allotted to clerically code approximately 25,000 cases. However, there were over 100,000 cases needing codes. To assign the highest quality codes, while meeting a demanding schedule, keyed data from both the PFU and EFU forms were used to augment clerical coding procedures. An automated coding algorithm, based on specific responses to the PFU and EFU questionnaires, was used to determine an appropriate code for each case. This was done for both the PFU interview and the EFU interview. The automated coding also assigned a Why code that describes the reason why the particular code was assigned. There were more than 60 possible Why code categories. A final code was assigned to each case using the following three-step process:

- Validation. Determine for each category of Why code if the automated coding is of high quality using the PFU/EFU Review as a truth deck.
- Targeting. Target only those Why code categories that have low levels of agreement between the automated coding and the PFU/EFU Review data.
- Clerical Review. Clerically recode only those cases in the targeted Why code categories. The clerical recoding takes advantage of handwritten interviewer comments.

Validation of Keyed Data

To validate the quality of coding produced by the keyed data algorithm, skip patterns for both questionnaires were programmed to determine an appropriate match code and Why code for each case. Then, for both the PFU and EFU forms, the percentage agreement with the original coding (either production coding or the coding of the EFU form) for the respective form, the percentage agreement with the PFU/EFU Review, and the residual risk were examined.
That is, the following calculations were performed twice, once for PFU and once for EFU.

The residual risk of disagreement (i.e., potential bias) represented the number of cases at risk for being coded wrong due to accepting the automated code for categories defined by questionnaire responses. Cases subject to risk were those where the automated code and original code agreed. If they disagreed, the automated code was rejected and the case was sent for clerical review. The risk for the cases agreeing is calculated as follows:

$$\mathrm{risk} = \mathrm{Agree}_{K} - \mathrm{Agree}_{Rev}$$

where

$\mathrm{Agree}_{K}$ = The weighted number of cases whose code from the keyed data agreed with the original production code.

$\mathrm{Agree}_{Rev}$ = Of those cases where the code from the keyed data agreed with the original production code, the weighted number of cases whose code from the keyed data agreed with the PFU/EFU Review code.

The term risk, rather than error, is used because some potential coding changes may not have had an effect on the DSE. For example, people who were in group quarters have a residual risk of 26,517 after computer coding.
These represent cases that probably should have been coded as erroneous enumerations, but were not. However, some of the 26,517 cases could be unresolved, which have a probability less than one of being correct.

The automated coding results for a given Why code category were rejected if the residual risk was too high or if there were not enough cases to make an informed decision. The exception to this rule was the category consisting of cases without any indication of living in a group quarters or other residence. This group was, by far, the largest category for both the PFU and EFU, so a higher residual risk[6] was expected.

Targeting Cases for Clerical Review

After the decision was made to accept or reject the automated code for each Why code category, cases were targeted for clerical review. Analysts, who were the highest level of clerical matchers, performed the clerical review. Due to their experience and additional training, they were assumed to make negligible errors in coding.
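The residual risk defined under Validation of Keyed Data can be illustrated with a short sketch. This is only an illustration under stated assumptions: the record layout, the field names, and the code values "CE" and "EE" are hypothetical and do not come from the Census Bureau's processing system, and the difference form of the risk is a reading of the formula as the weighted agreeing cases minus those that also agree with the PFU/EFU Review.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class CodedCase:
    """One sample case in a Why code category; all field names are hypothetical."""
    weight: float        # sampling weight of the case
    keyed_code: str      # automated code derived from the keyed data
    original_code: str   # original production (or original EFU) code
    review_code: str     # PFU/EFU Review code, treated as the truth deck


def residual_risk(cases: Iterable[CodedCase]) -> float:
    """Return Agree_K minus Agree_Rev for the given cases."""
    agree_k = 0.0    # weighted cases where keyed code equals original code
    agree_rev = 0.0  # of those, weighted cases that also equal the Review code
    for case in cases:
        if case.keyed_code != case.original_code:
            # Disagreeing cases go to clerical review and carry no residual risk.
            continue
        agree_k += case.weight
        if case.keyed_code == case.review_code:
            agree_rev += case.weight
    return agree_k - agree_rev


# Illustrative data only; weights and codes are invented for the example.
example = [
    CodedCase(10.0, "CE", "CE", "CE"),  # agrees everywhere: no risk
    CodedCase(20.0, "CE", "CE", "EE"),  # accepted automatically, but Review differs
    CodedCase(5.0, "EE", "CE", "EE"),   # keyed vs original disagree: clerical review
]
print(residual_risk(example))  # 20.0
```

Because the measure is a weighted count and not a rate, a large category can show a high risk even when its agreement rate is good, which is consistent with the source's note that absolute risk was used.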
[6] Absolute risk, rather than relative risk, is used. Therefore, larger categories tended to have higher risks.

In general, cases did not go to clerical review if both the PFU and EFU automated codes agreed, the mover statuses agreed, and the Why code category was deemed to be of high enough quality. In some instances, cases were exempt from clerical review because they could be coded based on information available in data files. For many of these situations, consistent and complete data were obtained from both the PFU and EFU interviews. These cases included:

- Census Usual Home Elsewhere. If the person claimed a Usual Home Elsewhere on certain types of census forms, they were counted as a correct[7] enumeration within the cluster and did not need clerical review.
- Geocoding Errors from Initial Housing Unit Matching. If a case should not have been sent to PFU or EFU and was only sent due to clerical error in the initial production matching, then it did not need clerical review.

In contrast, some cases were automatically sent to clerical review. For example, this includes cases in the PFU/EFU Review that resulted in a conflicting status, noninterview cases, or cases where mover dates could not be determined from the EFU keyed data. Some of the cases that went to clerical review did so because the original A.C.E. or PFU results did not agree with the EFU results. Most of the cases went to clerical review because the automated coding process was not reliable for that Why code category.

For P-sample inmovers, there was no validation data. Cases where the original EFU mover status did not match the mover status from the keyed data, or the residence status from the keyed data did not match the original EFU residence status, were sent to clerical review. Noninterview cases or cases where mover dates could not be determined from the keyed data were also sent to clerical review.

Cases with the following attributes were sent to clerical review:

- the code from the keyed data for either form was not accepted for that case.
- the code from the keyed data was accepted for both forms, but at least one of the codes from the keyed data did not agree with its original code (i.e., the PFU code from the keyed data did not agree with production or the EFU code from the keyed data did not agree with the original EFU code).
- for P-sample people, the mover status from the keyed data did not agree with mover status assigned during the EFU coding.
- there was write-in information in open-ended questions on the form that could not be coded.
- the case was a possible match in before follow-up matching and the production and original EFU code disagreed.
- the case was a duplicate in either the original EFU coding or production after follow-up coding.
- the case was not yet flagged for clerical review and the PFU code from the keyed data did not agree with EFU code from the keyed data, and one of the cases was not unresolved for certain reasons.
- the case was in the PFU/EFU Review and was conflicting or had a mover status disagreement between the keyed data and the original EFU mover status.

Clerical Review

The clerical review for A.C.E. Revision II was an analyst-only operation. The following data were collected:

- Match Code for each form
- Why Code for each form
- Respondent for each form
- Whether the respondents are the same for the two interviews
- Best Code. A code indicating which form is the better of the two forms
- Smooshed Code. Information from both forms combined to make a code to represent the true situation
- Mover Status. Mover Status from the EFU form for P-sample people

The match codes were assigned using the census residence rules to construct coding rules for the flow of the questionnaire. The best code could be one of four values:

- Both = The enumeration statuses were the same
- PFU = The PFU form provided better information
- EFU = The EFU form provided better information
- Conflicting = Similar caliber respondents (e.g., husband and wife; two neighbors) provided contradictory information for the case

To ensure reproducibility, computer edits were applied to the best code. If the analyst did not follow pre-specified rules, then the analyst had to review the case again or leave a note indicating the situation.
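The attributes that sent a case to clerical review, listed under Targeting Cases for Clerical Review, can be sketched as a simple rule check. This is a minimal sketch and not the Census Bureau's algorithm: the record layout and every field name are hypothetical, and only the attributes that can be stated unambiguously are covered (the possible-match rule and the PFU versus EFU keyed-code comparison are left out because the source does not give enough detail to encode them).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class KeyedCase:
    """Hypothetical per-case record; none of these names come from the source."""
    sample: str                      # "P" or "E"
    pfu_keyed_accepted: bool         # automated PFU code accepted for this case
    efu_keyed_accepted: bool         # automated EFU code accepted for this case
    pfu_keyed_code: str
    pfu_original_code: str           # production code
    efu_keyed_code: str
    efu_original_code: str           # original EFU code
    keyed_mover_status: Optional[str] = None
    efu_mover_status: Optional[str] = None
    uncodable_write_in: bool = False
    duplicate: bool = False          # duplicate in original EFU or production coding
    review_conflicting: bool = False # conflicting in the PFU/EFU Review


def clerical_review_reasons(case: KeyedCase) -> List[str]:
    """Return the reasons a case would be targeted; an empty list means none apply."""
    reasons: List[str] = []
    both_accepted = case.pfu_keyed_accepted and case.efu_keyed_accepted
    if not both_accepted:
        reasons.append("keyed code not accepted")
    elif (case.pfu_keyed_code != case.pfu_original_code
          or case.efu_keyed_code != case.efu_original_code):
        reasons.append("keyed code disagrees with original code")
    if case.sample == "P" and case.keyed_mover_status != case.efu_mover_status:
        reasons.append("mover status disagreement")
    if case.uncodable_write_in:
        reasons.append("uncodable write-in")
    if case.duplicate:
        reasons.append("duplicate")
    if case.review_conflicting:
        reasons.append("conflicting in PFU/EFU Review")
    return reasons


# Illustrative cases only.
clean = KeyedCase("E", True, True, "CE", "CE", "CE", "CE")
flagged = KeyedCase("P", True, False, "CE", "CE", "CE", "CE",
                    keyed_mover_status="nonmover", efu_mover_status="inmover")
print(clerical_review_reasons(clean))
print(clerical_review_reasons(flagged))
```

Returning the list of reasons, instead of a single yes or no, keeps a record of why each case was targeted, which is similar in spirit to the Why codes the automated coding attached to each case.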
[7] A person can claim a usual home elsewhere if he or she is enumerated on certain types of census forms in group quarters (e.g., military, shipboard, and certain types of special places like shelters). If a person on one of these forms claims a usual home elsewhere, then that person is counted at the address they indicate is their usual home. These people are part of the E sample because they are part of the housing unit universe.

CORRECTION OF MOVER STATUS ASSIGNMENT ERRORS

For each P-sample case, mover status was based on the EFU. This was used to determine whether or not the person needed clerical review.

CORRECTION OF MATCHING ERRORS

After the A.C.E. Revision II recoding operation corrected for enumeration, residence, and mover status, the results of the Matching Error Study (MES) were used to correct for false matches and false nonmatches. Some matching errors were a result of incorrect residence status coding and were corrected as part of the recoding operation discussed above. To determine the correct match status, each of the possible combinations of match status was reviewed to determine the appropriate match status for each type of case. In general, the MES match status was assigned when there were changes from a match to a nonmatch or changes from a nonmatch to a match. For other situations, the match status from the EFU coding was assigned.

DATA OUTPUTS

After the clerical operation was completed, two files were assembled, one for the P sample and another for the E sample. The files contain match codes and Why codes (where appropriate) for original March 2001 A.C.E., EFU, PFU/EFU Review, Keyed Data, and A.C.E. Revision II Clerical Review. A final code is also assigned in the following hierarchy: A.C.E. Revision II Clerical Review, PFU/EFU Review, Keyed Data. This code reflects the final match, residence, and enumeration status for the A.C.E. Revision II process.

LIMITATIONS

There were several limitations on the data for the A.C.E. Revision II:
- Sample Size. The sample used to estimate measurement error is 2,259 clusters, containing about 10 percent of the persons in the sample used in the production A.C.E. Due to the smaller sample size, some subgroup estimates are subject to higher variances compared to those for the original March 2001 A.C.E.
- Conflicting Cases. Conflicting cases occurred when the PFU and EFU interviews had respondents of the same caliber (either both nonproxy or proxy respondents who were in the position to have similar knowledge about the household, e.g., two neighbors) who gave contradictory information. Since an additional field follow-up was not possible, these cases were coded as conflicting, were reviewed separately, and imputed.
- Data Collection Error. Cases were coded as best as possible. However, there was no attempt to correct for any residual data collection error. Any remaining respondent and interviewer errors could not be rectified without an additional field follow-up.

Chapter 4. A.C.E. Revision II Missing Data Methods

BACKGROUND

Missing data arise because it is not possible to obtain interviews for all sample cases or to obtain answers to all interview questions. This was as true for the A.C.E. Revision II as it was for the A.C.E. To put the A.C.E. Revision II missing data methods in perspective, a brief summary of the A.C.E. missing data adjustments is presented. For the A.C.E. P sample, a household noninterview adjustment compensated for noninterviewed households. Imputation methods were implemented to handle missing characteristics such as age or tenure. Further, match and residency probabilities were assigned when the respective match and residency statuses could not be definitively determined. There was no noninterview adjustment for the A.C.E. E sample, nor was there an imputation for missing characteristics as the census imputations were used. However, E-sample cases with unresolved enumeration status were assigned probabilities of correct enumeration. See Ikeda and McGrath (2001) for details on the A.C.E. missing data methodology.

As will be discussed in Chapter 6, the A.C.E. Revision II estimation utilizes both the original A.C.E. coding results on the Full E and P samples and the Revision coding results on the smaller Revision Samples. Note that the A.C.E. Revision II subsample of the A.C.E. is referred to as the Revision Sample and the new coding operation is called the Revision coding. The missing data adjustments for the A.C.E. E and P samples were unchanged from those used to produce A.C.E. estimates, with the exception of the imputation for missing age. It was necessary to impute age again for the Full A.C.E. P sample because the A.C.E.
Revision II post-strata had different age groupings.

The Revision P sample used the same imputations for missing characteristics that the A.C.E. did, including the new age imputation. However, since A.C.E. Revision II measurement methodology had important differences from the A.C.E. measurement methods, it was necessary to develop new missing data methods. The A.C.E. Revision II missing data methods confronted three general types of new missing data problems:

1. New noninterviewed households: Revision P-sample households that were considered interviews in the A.C.E. were identified as noninterviews in the Revision coding when it was determined that none of the P-sample people there were valid Census Day residents.
2. Revision E- and P-sample cases with unresolved match, enumeration, or residency status, because of incomplete or ambiguous interview data from the Person Follow-up (PFU) or the Evaluation Follow-up (EFU).
3. Revision E- or P-sample cases with conflicting enumeration or residency status because contradictory information was collected in the PFU and the EFU interviews and it could not be determined which was valid.

AGE IMPUTATION

For the original A.C.E., P-sample people with missing age were assigned to age categories defined by the post-stratification plan. The A.C.E. Revision P-sample post-stratification divided the original A.C.E. post-stratification group of 0-17 year olds into two age groups: 0-9 and 10-17. Those people with missing age who had been assigned to the 0-17 group were reassigned to either the 0-9 or the 10-17 group. This reassignment assumed that the age distribution of people missing age was uniform within the 0-17 age grouping. Other people with unresolved age remained in the age group they had been originally assigned to.

HOUSEHOLD NONINTERVIEW ADJUSTMENT

The A.C.E. household noninterview adjustment generally spread the weights of P-sample noninterviewed housing units over interviewed housing units in the same block cluster with the same housing unit structure type. Housing units were determined to be noninterviews in two ways: 1) an interview was not conducted during the A.C.E. person interview operation, and 2) based on the results of the
A.C.E. PFU, it was determined that a whole household of P-sample people should not have been listed in the first place, and that another household may have been residents at that housing unit. Separate household noninterview adjustments were implemented for Census Day and A.C.E. Interview Day.

The A.C.E. Revision II noninterview adjustment methodology for A.C.E. Interview Day was essentially unchanged from the A.C.E. There was, however, an important change from the A.C.E. methodology for the noninterview adjustment for Census Day residency. In A.C.E. Revision II, a new imputation cell was defined. It included new noninterviews due to whole households of A.C.E. nonmovers who were determined to be inmovers or nonresident outmovers by the Revision coding. The new noninterview cell spread the weights of these noninterviewed units over housing units with at least one person who: 1) indicated he/she lived at another address, or 2) was identified as potentially fictitious in the A.C.E. These new noninterviews were assumed to have both a low match rate and a low residency rate similar to this group. Otherwise, the noninterview adjustment for Census Day used methodology similar to that of the A.C.E.

ASSIGNMENT OF PROBABILITIES OF CORRECT ENUMERATION, CENSUS DAY RESIDENCY, AND MATCH STATUS

In the A.C.E., P-sample people with unresolved Census Day residency or match status occurred in one of two ways. Firstly, the A.C.E. person interview may not have provided sufficient information for match and follow-up. Secondly, the A.C.E. PFU may not have collected adequate information for determining a person's Census Day residency status or their match status. Inadequate data collection can also result in unresolved enumeration statuses for A.C.E. E-sample people. In the A.C.E. Revision II, the EFU was also the source of unresolved cases. How a case was imputed depended on how it became unresolved.

Imputation for People with Insufficient Information for Match and Follow-Up

The Revision P-sample people with insufficient information for match and follow-up tended to be the same people
who had insufficient information for match and follow-up in the A.C.E., except for some rare cases with coding changes. Note that people who had insufficient information in the A.C.E. were not sent to EFU. There were about three million weighted people with insufficient information for match and follow-up in both the Full and Revision P samples.

In the A.C.E., P-sample people with insufficient information for match and follow-up were assigned a probability of Census Day residency equal to the residency rate of P-sample people who went to PFU. This methodology was improved in the A.C.E. Revision II by defining finer imputation cells that accounted for whether or not the housing unit was matched, nonmatched, or had a conflicting household. A conflicting household existed when the P- and E-sample households had no people in common. The probability of match was assigned based on the overall match rate, divided into groups based on mover status and housing unit match status, as was done in the A.C.E., and additionally on conflicting household status.

Imputation for People with Incomplete or Ambiguous Follow-Up

In contrast to P-sample people with insufficient information, the residency status for Revision P-sample people and the correct enumeration status for Revision E-sample people often changed from what it was in the A.C.E. These statuses changed because the Revision coding processed new information from the EFU, in addition to the original information from the PFU. Thus, while the EFU information resolved many cases that were unresolved in the A.C.E. because of the PFU, EFU cases with incomplete or ambiguous information were a new source of unresolved cases. There were about the same number of weighted E-sample unresolved cases in the Revision sample as in the A.C.E.,
more than six million, with about half of these representing new unresolved cases. In contrast, the Revision coding generated substantially more P-sample unresolved cases than the A.C.E., 4.6 million compared to 2.7 million. This increase was due to the fact that all Revision P-sample cases (except those with insufficient information) went to EFU, including whole households of nonmatched people who had not gone to PFU. These people were assumed to be resolved in the A.C.E. and could have become unresolved because of the EFU.

The original A.C.E. missing data plan based the imputation cells on information obtained before any follow-up was conducted. An ad hoc fix to the A.C.E. missing data methodology was implemented using information from the PFU. See Cantwell and Childers (2001) for details. Based on the PFU keyed data, after follow-up groups for potential fictitious and lived elsewhere on Census Day were created. The new cells used information highly relevant to resident or enumeration status. Further, they showed greater discrimination in assigning probabilities of correct enumeration and residency. In A.C.E. Revision II, the before follow-up imputation cells were abandoned and the cells were defined based on after follow-up information. This change was the single most important improvement in the A.C.E. Revision II missing data methodology.

The after follow-up group definitions were based on keyed responses to the PFU and EFU questionnaire checkboxes and the Why codes. Why codes were clerically applied codes that reflected responses in the questionnaire checkboxes, as well as handwritten notes. See Adams and Krejsa (2002) for a detailed description. The keyed results and Why codes helped identify the following:

- unresolved cases with the same history, i.e., the recipient cells.
- resolved follow-up cases with the same history up to the point of being unresolved, i.e., the donor pool.

PFU after follow-up groups were defined for those cases that were unresolved as a result of the PFU. Similarly, EFU after follow-up groups were defined for those cases unresolved because of the EFU. It was necessary to define separate groups for the PFU and EFU, because their interviews and questionnaires were different. However, the same after follow-up groups were employed for the P- and E-sample unresolved cases, as the PFU and EFU questions about Census Day residency were the same as the PFU and EFU questions about enumeration
status.

It is useful to distinguish between uninformative and informative unresolved cases:

- uninformative unresolved: the follow-up was a noninterview or an incomplete interview, though there was no evidence of an erroneous enumeration or nonresident.
- informative unresolved: a follow-up interview was conducted, and there was evidence of an erroneous enumeration or nonresident.

Note that when one interview was uninformative unresolved, but the other interview was resolved, the Revision coding selected (i.e., the code was based on) the resolved interview. On the other hand, when the unresolved interview was informative, the Revision coding could choose the unresolved interview over the resolved one. See Adams and Krejsa (2002) for details of the Revision coding.

It often happened that both the PFU and EFU interviews were unresolved. To assign this case to an imputation cell, the unresolved interview that was more informative was selected. When both interviews had the same level of information, the EFU was typically selected over the PFU, because questions on the EFU questionnaire were more sharply defined.

Consider the following example of an after follow-up group. One cell of unresolved E-sample people or recipients was defined as people with evidence from the EFU interview that they had moved in since Census Day, or moved out before Census Day, though the EFU interview did not provide the address they moved to or from. It was impossible to determine the enumeration status of these people, since it was unclear if their Census Day address was in the A.C.E. cluster. The corresponding donor pool consisted of those resolved people who indicated in the EFU that they had moved in after Census Day or moved out before Census Day. Generally, these people provided their mover address in the EFU. An analogous after follow-up group was formed for people unresolved because they indicated they were movers in the PFU interview. These groups are characterized as informative, because the follow-up provided evidence of an erroneous enumeration.

Table 4-1 shows the nine EFU after follow-up groups, while Table 4-2 shows the nine PFU after follow-up groups. People who moved in after Census Day or moved out before Census Day were the largest informative after follow-up group. Another important informative after follow-up group consisted of people who, according to the follow-up, had another residence such as a vacation home, though the follow-up did not indicate whether the other residence or the sample address was the Census Day residence. The noninterview groups and "didn't answer other residence questions" group were the larger uninformative
groups.

Table 4-1. EFU After Follow-up Groups

Informative groups
- The followed up person "Lived elsewhere" or at another residence, but the address was not given.
- Followed up person moved in after Census Day or out before Census Day, but Census Day address not given.
- Respondent indicated the followed-up person "Never lived here" at the sample address, but did not provide the Census Day address.
- The followed-up person had another residence, but did not indicate whether the sample address or other residence was the Census Day residence.
- Followed up person moved in or moved out, but no move dates given.

Uninformative groups
- The respondent indicated the followed up person "Lived here" at the sample residence, but did not answer the other residence question.
- The respondent answered the current residence question, but did not answer the group quarters and other residence questions.
- The respondent did not answer the usual residence question, nor the group quarters and other residence questions.
- Potentially fictitious person, no respondents knew of the followed up person.

Some of the larger EFU groups were subdivided by A.C.E. operational variables, such as whether or not the household went to PFU, or whether the household was conflicting. The uninformative after follow-up groups tended to have imputed probabilities of correct enumeration or residence close to one, typically in the range of 0.92 to 0.99.
In contrast, the informative after follow-up groups had smaller probabilities, often less than 0.25. The probability of correct enumeration is calculated as the weighted proportion of correct enumerations in the donor pool. For example,

$$\text{Probability of correct enumeration} = \frac{ \text{Weighted CEs in donor pool} }{ \text{Weighted resolved enumerations in donor pool} }$$
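As a rough illustration of this ratio, the sketch below computes the weighted proportion for one imputation cell. The record layout (weight, correct-enumeration flag), the function name, and the example weights are all hypothetical and are not taken from the A.C.E. files.

```python
def donor_pool_probability(donors):
    """Weighted share of correct enumerations among resolved donors.

    `donors` is an iterable of (weight, is_correct) pairs. The layout is
    an assumption made for this sketch only.
    """
    donors = list(donors)
    total = sum(weight for weight, _ in donors)
    if total == 0:
        raise ValueError("donor pool has no weighted resolved enumerations")
    correct = sum(weight for weight, is_correct in donors if is_correct)
    return correct / total


# Hypothetical donor pool: three resolved people with sampling weights.
pool = [(120.0, True), (80.0, False), (200.0, False)]
probability = donor_pool_probability(pool)
print(round(probability, 2))
```

Every unresolved person in the cell would then receive this single value as the imputed probability.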
For the P sample, probabilities of residency and match status were calculated analogously.

Table 4-2. PFU After Follow-up Groups

Informative groups
* The followed-up person "Lived elsewhere" or at another residence, but the address was not given.
* Followed-up person moved in after Census Day or out before Census Day, but Census Day address was not given.
* The respondent indicated the followed-up person did not live here at the sample address, but did not indicate the other address and did not answer the group quarters and other residence questions.
* The followed-up person had another residence, but did not indicate where the usual residence was.

Uninformative groups
* The respondent indicated the followed-up person "Lived here" at the sample residence, but did not answer the other residence question.
* The respondent answered the usual residence question, but did not answer the group quarters and other residence questions.
* The "lived here" question is "Don't Know"/refused, and the group quarters and other residence questions were not answered.
* Blank questionnaire.
* Potentially fictitious person, no respondents knew of the followed-up person.

Imputation for Conflicting Coding Cases

When the A.C.E. EFU and PFU interviews had contradictory information, the Revision coding procedure assigned the case a conflicting code. Note that a conflicting code is different from a conflicting household. All conflicting cases in the Revision coding process were sent to analysts for clerical review. By examining the handwritten notes of interviewers, analysts could often determine which of the two interviews was better and assign the appropriate code.
There were some cases where the interviews appeared to be of equal quality, such as when both respondents were household members or both respondents were proxies of equal caliber. For these conflicting cases, the interviews seemed equally likely to be correct based on the analysts' expertise. Therefore, the probability of correct enumeration for Revision E-sample conflicting cases and the probability of Census Day residency status for Revision P-sample conflicting cases were assigned to be 0.5. It should be noted that the Revision coding resulted in considerably fewer conflicting cases than the PFU/EFU Review Sample. According to Adams and Krejsa (2001), the PFU/EFU Review Sample had about 2.6 million weighted conflicting people in contrast to only about 100,000 weighted conflicting people in the Revision Samples.

Chapter 5. Further Study of Person Duplication in Census 2000

INTRODUCTION

Evaluations of the March 2001 coverage estimates indicated the A.C.E. failed to detect a large number of erroneous census enumerations. One type of these census erroneous enumerations was duplicate census enumerations; that is, census enumerations included in the census two or more times. The A.C.E. was not specifically designed to detect duplicate census enumerations beyond the search area. However, there was an expectation that the A.C.E. would detect that these E-sample enumerations had another residence, and that, roughly half the time, this other residence was the usual residence. Feldpausch (2001) showed this expectation was not met.

For purposes of constructing A.C.E. Revision II estimates, matching and modeling techniques were used to identify duplicate links between the Full E and P samples to census enumerations. The matching algorithm used statistical matching to identify linked records. Statistical matching allowed for the matching variables not to be exact on both records being compared. Because linked records may not refer to the same individual, even when the characteristics used to match the records are identical, modeling techniques were used to assign a measure of confidence, the
duplicate probability, that the two records refer to the same individual. These duplicate probabilities were used in the A.C.E. Revision II estimates.

This chapter documents the matching and modeling methods that were used to identify duplicate links and to produce duplicate probabilities. Note that this study was not intended to identify which enumeration was in the correct location. Chapter 6 describes how to compute the conditional probability that the sample case was in the correct location given that it had a link to a census enumeration outside the A.C.E. search area. This calculation impacts the correct enumeration status in the E sample and the residence status in the P sample. A full discussion of the estimation components is given in Chapter 6.
BACKGROUND

Mule (2001) reported results for initial attempts at measuring the extent of person duplication in Census 2000. This work was conducted by an inter-divisional group as part of the further research to inform the ESCAP II decision on adjusting census data products. This study is referred to as the ESCAP II duplicate study in this chapter.

The ESCAP II duplicate study used conservative computer matching rules to minimize the number of false matches that could be introduced when doing a nationwide search, since there was no clerical review of the results. As a consequence of the matching rules, comparisons to benchmarks indicated that the ESCAP II duplicate estimates were a lower bound. Specifically, comparing the ESCAP II results within the A.C.E. sample area to the A.C.E. clerical matching results showed that only 37.8 percent of the census duplicates were identified. Fay (2001, 2002) estimated the matching efficiency at 75.7 percent when accounting for the census records out-of-scope for the A.C.E. duplicate search. The out-of-scope records were those that were reinstated and deleted from the Housing Unit Duplication Operation, documented in Nash (2000).

The ESCAP II matching was a two-step process. First, the sample of census records was matched to the full census on first name, last name, month of birth, day of birth and computed age. Age was allowed to vary by one year. Middle initials and suffixes being scanned into the first name field were accounted for; however, the other characteristics had to be exact matches at this stage. This first-stage match established a link between households. In the second stage, all person records in the linked households from the first stage were statistically matched using first name, middle initial, last name, month of birth, day of birth, and computed age. The matching parameters used in the statistical matching were borrowed from other Census 2000 matching operations. Mule (2001) describes this matching algorithm in more detail.

To reduce the impact of false matches, particularly with respect to persons with common names and the same month and day of birth, model weights were applied to each set of linked records as a measure of confidence that
the linked records were indeed duplicates. Due to schedule constraints, a national Poisson model was used in lieu of a probability model.

The ESCAP II census duplicate methodology satisfied the intended project goals and provided a valuable evaluation of the census by showing that person duplication existed. However, limitations of the methodology made it difficult to estimate the magnitude of person duplication in the census.

OVERVIEW OF THE DUPLICATE STUDY PLAN

Like the ESCAP II study, the A.C.E. Revision II duplicate plan involved matching the Full E and P samples to the census to establish potential duplicate links. Then, modeling techniques were used to identify the links most likely to be duplicate enumerations and to assign a measure of confidence that the links are duplicates. Key differences with the ESCAP II study include extending the use of statistical matching and developing models to assign a duplicate probability to the links. An advantage of duplicate probabilities over the Poisson model weights used in the ESCAP II study is that all duplicate links outside the A.C.E.
search area could be reflected in the A.C.E. Revision II estimates. Fay (2001, 2002) used a subset of the ESCAP II duplicate links to produce a lower bound on the level of erroneous enumerations that the A.C.E. did not measure.

Estimates of census duplication were based on matching and modeling E-sample cases to the census. For purposes of A.C.E. Revision II estimation, the P sample was also matched to the census. However, these results did not contribute to estimates of person duplication in the census. The A.C.E. Revision II estimation methodology adjusted the A.C.E. correct enumeration rate for E-sample cases with links outside the A.C.E. search area. Further, the A.C.E. Revision II estimation methodology adjusted the A.C.E. match rate for P-sample cases that linked to census cases outside the search area.

The matching algorithm consisted of two stages. The first stage was a national match of persons using statistical matching as described in Winkler (1995). Statistical matching attempted to link records based on similar characteristics or close agreement of characteristics. Exact matching required exact agreement of characteristics. Statistical matching allowed two records to link in the presence of missing data and typographical or scanning errors.

Six characteristics common to both files, called matching variables, were used to link records in the Full E and P sample with records in the census. Matching parameters associated with each matching variable were used to measure the degree to which the matching variables agreed between the two records, ranging from full agreement to full disagreement. The measurement of the degree to which each matching variable agreed was called the variable match score. The overall match score for the linked records was the sum of the variable match scores.

Full agreement of at least four characteristics was required to be considered a duplicate link. Because this study was a computer process without the benefit of a clerical review, this limitation of the statistical matching was necessary in order to minimize linking records with similar characteristics that represented different people. This was a particular concern when looking for duplicate enumerations across the entire country. The need to use statistical
matching at the first stage was apparent after the limited success of the ESCAP II exact matching procedure to identify A.C.E. duplicates in the A.C.E. sample areas. The statistical matching yielded better identification of the A.C.E. duplicates, but to identify all of the A.C.E. duplicates would have required fewer characteristics to be exact matches. This could potentially lead to a high number of false links.

The search for duplicate links between the Full E and P samples and the census was limited to those pairs that agreed on certain identifiers, or blocking criteria. Blocking criteria were sort keys that were used to increase the computer processing efficiency by searching for links where they were most likely to be found. For instance, to search only for duplicates when the first and last names agreed, both the sample and census files would have been sorted by the blocking criteria of first and last name. Then, all possible pairs within each first name/last name combination would have been searched for duplicate links.
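The blocking idea can be sketched in a few lines. The example below groups census records by a blocking key and compares each sample record only with census records sharing that key; running more than one key and taking the union of the candidate pairs recovers links that a single key would miss. The record layout, field names, and helper functions are hypothetical, and the second key (initials plus month and day of birth) is included only to illustrate a looser pass.

```python
from collections import defaultdict


def name_key(rec):
    # Blocking on first and last name, as in the example in the text.
    return (rec["first"], rec["last"])


def initials_birth_key(rec):
    # A looser illustrative key: initials plus month and day of birth.
    return (rec["first"][:1], rec["last"][:1], rec["month"], rec["day"])


def candidate_pairs(sample, census, key_funcs):
    """Return (sample_id, census_id) pairs that share any blocking key."""
    pairs = set()
    for key_func in key_funcs:
        blocks = defaultdict(list)
        for rec in census:
            blocks[key_func(rec)].append(rec["id"])
        for rec in sample:
            for census_id in blocks.get(key_func(rec), []):
                pairs.add((rec["id"], census_id))
    return pairs


# Hypothetical records; "TUM" stands in for a scanning error in "TIM".
sample = [{"id": "S1", "first": "TIM", "last": "SMITH", "month": 4, "day": 12}]
census = [
    {"id": "C1", "first": "TIM", "last": "SMITH", "month": 4, "day": 12},
    {"id": "C2", "first": "TUM", "last": "SMITH", "month": 4, "day": 12},
    {"id": "C3", "first": "ANN", "last": "JONES", "month": 1, "day": 2},
]
by_name = candidate_pairs(sample, census, [name_key])
both = candidate_pairs(sample, census, [name_key, initials_birth_key])
print(sorted(by_name))
print(sorted(both))
```

With the name key alone, the misspelled record C2 is never compared with S1; adding the second key brings it back into the candidate set.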
Although true matches can be missed by using blocking criteria, multiple sets of blocking criteria minimize the number of missed matches. The A.C.E. Revision II duplicate study utilized four sets of blocking criteria.

At the first stage of matching, it was possible for one sample case to link to multiple census records. All of these links were retained for the second stage of matching.

The second stage of matching was limited to matching persons within households. If an E- or P-sample case linked to a census record in a group quarter, the case did not go to the second stage. Using results from the first stage of matching, a link between two housing units was established. The second stage was a statistical match of all household members in the sample housing unit to all household members in the census housing unit. The second-stage matching variables were the same as the first stage; however, the matching parameters differed. Using a subset of the first-stage links, the second-stage matching parameters were derived using the Expectation-Maximization (EM) algorithm. See Winkler (1995) for a more detailed explanation. A key difference between the first- and second-stage parameters was the reduced emphasis on requiring last names to agree in the second stage. This intuitively makes sense, since second-stage matching was within a given household.

The household was the only set of blocking criteria used at the second stage of matching. Sample records were allowed to link to only one census record within the household. As a consequence, this limited the ability to identify within-household duplicate links. Each link had an overall match score based on the second-stage matching.

The set of linked records from the second-stage matching and the links to group quarter enumerations from the first stage consisted of both duplicate enumerations and person records with common characteristics. Two modeling approaches were used to estimate the probability that the linked records were duplicates. One approach used the results of the statistical matching and relied on the strength of multiple links within the household to indicate
person duplication. The second relied on an exact match of the census to itself and the distribution of births, names and population size to indicate if the individual link was a duplicate. These two approaches were referred to as the statistical match modeling and the exact match modeling, respectively. These two approaches were combined so that each sample case with a link to a census enumeration had an estimated probability of being a duplicate.

The statistical match modeling was used when two or more duplicate links were found between housing units in the second stage. After the second-stage matching, each duplicate link between a sample household and census household had an overall match score. So, for each sample household, a set of match scores was observed. For any resulting set of match scores, a probability of not observing this set of match scores was estimated. See the attachment for details. The higher this probability, the more likely that the set of linked records in the household were duplicates.

The estimate of the probability of not observing this set of match scores assumed independence of the individual match scores within each household. This assumption was based on using the EM algorithm to determine the second-stage matching parameters. The probability of observing the individual match scores was estimated from the empirical distribution of individual match scores resulting from the second-stage matching. Further, this measure accounted for the number of times that a unique sample household was matched to different census households within a given level of geography. The probability of not observing this set of match scores was translated into a statistical match duplicate probability of 0 or 1 based on critical values that varied by level of geography.

The exact match modeling relied on an exact match of the census to itself. The methodology accounted for the overall distribution of births, frequency of names, and population size in a specific geographic area. Duplicate probabilities were computed separately by geographical distance of the links. Further, duplicate links were modeled separately by how common the last name was, as well as for Hispanic names.

The two approaches were combined to assign an estimated probability that the linked records were duplicates.
The duplicate probabilities for the links to group quarters in the first stage and for one-person household links were from the exact match modeling. For all other links, the duplicate probability was the larger of the two model estimates. For nonexact matches, this was always from the statistical match modeling. For exact matches, adjustments were made to account for the integration of these two methods.

Based on the results of this matching and modeling, an overall estimate of census duplicates was derived from the E-sample links. Further, for each Full E- and P-sample person who linked outside the A.C.E. search area, these results provided the probability that they were in fact the same person. These duplicate probabilities were used in the A.C.E. Revision II estimates.

MATCHING ALGORITHM

Efforts to increase matching efficiency over the ESCAP II duplicate study included implementing statistical matching of persons at the first stage and the use of more discriminating matching parameters at the second stage.
Inputs

Both the Full E and P samples were matched to the census records. The E-sample records reflected any updates made by the clerical staff during the A.C.E. matching operation when the census characteristics were incorrectly transcribed or scanned. The P sample included all nonmovers, outmovers, and inmovers. The same matching algorithm was used for the Full E and P samples.

The census files consisted of data-defined person records for both the household and group quarters populations.
Both the reinstated and deleted records from the Housing Unit Duplication Operation described in Nash (2000) were included in the matching, so these links could be reflected in the A.C.E. Revision II estimates.

First Stage: Person-Level Matching

The first stage was a statistical match of the Full E and P samples to the census. This was a national match where each Full sample case was compared with census records across the nation to assess how well the matching variables agreed. The matching variables were first name, last name, middle initial, month of birth, day of birth, and computed age.

The matching variables and parameters are given in Table 5-1. The agreement weight and the disagreement weight are the matching parameters of each variable. Standard matching parameters were used at the first stage. The relationship of the agreement and disagreement parameters translated into the match score for each variable. For example, the full agreement value for first name was 2.1972, whereas the full disagreement match score was -2.1972. The sum of the variable match scores was the total match score. When the match score was 9.4006, this indicated full agreement of all variables. A match score of -9.4006, on the other hand, indicated full disagreement.

Table 5-1. First-Stage Matching Parameters

Matching variable  Type of comparison  Agreement weight (m)  Disagreement weight (u)  Agreement score ln(m/u)  Disagreement score ln((1-m)/(1-u))
First name         String (uo)         0.9                   0.1                      2.1972                   -2.1972
Last name          String (uo)         0.9                   0.1                      2.1972                   -2.1972
Middle initial     Exact               0.7                   0.3                      0.8473                   -0.8473
Month of birth     Exact               0.8                   0.2                      1.3863                   -1.3863
Day of birth       Exact               0.8                   0.2                      1.3863                   -1.3863
Computed age       Age (p)             0.8                   0.2                      1.3863                   -1.3863
Total                                                                                 9.4006                   -9.4006

The type of comparison indicated the statistical matching method for comparing the variables. For example, the string comparator was used for first name and last name. This method addressed typographical errors in names. For example, Tim and Tum can yield a positive agreement score. An exact match algorithm would have treated these as a disagreement. For age, the age values could have
been off by +/- one year and still received a full agreement score on computed age.

The Statistical Research Division matching software called BigMatch, documented in Yancey (2002), was used in the first stage. This software allowed a sample record to link to more than one census record. This capability was important, since it was possible for there to be more than two enumerations of the same person in the census.

Four blocking criteria were used. Blocking restricted the comparisons of records to only those that exactly agreed on certain values. Most records that did not agree on the values below are probably not duplicates. The blocking criteria were:

* First name, last name
* First name, first initial of last name, age groupings (0-9, 10-19, 20-29, etc.)
* Last name, first initial of first name, age groupings (0-9, 10-19, 20-29, etc.)
* First initial of first name, first initial of last name, month of birth, day of birth

All possible links within each blocking criterion were compared. For each comparison, the variable match score and the total match score were computed. The first-stage matching decision rules were as follows. First, a match must have had at least four of the match variables in full agreement. This meant that four of the variables had to have a match score equal to the agreement match score in Table 5-1. The one exception was the middle initial. When the middle initial was blank, it was considered to be in full agreement in this study since the middle initial was often missing on the sample and census records. In this case, the middle initial score was zero. Second, the total match score had to be 4.7 or greater. This minimum score was about half the total score for full agreement of all matching variables.

Table 5-2 shows the distribution of A.C.E. links within cluster that were identified by the resulting number of matching variables in full agreement. There were a total of 10,559 duplicate links identified by the A.C.E. clerical staff that agreed on the first letter of the first and last name. The table shows the number of identified A.C.E. duplicates as the number of matching variables in full agreement
decreased. The table also displays the number of total links that were identified. The percent of A.C.E. links in each row of the table decreases as the number of matching variables in full agreement decreases.

By requiring at least four matching variables to be in full agreement, 68.4 percent of these A.C.E. duplicates were identified. On the other hand, when only four of the six variables fully agreed, only 30.4 percent of the total links identified by this criterion were A.C.E. Revision II duplicates.
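The first-stage scoring and decision rules described above can be sketched as follows. This is a minimal illustration, not the BigMatch implementation: the function names and the dictionary layout are hypothetical, while the (m, u) values are those listed in Table 5-1 and the thresholds (four variables, total of 4.7) are those stated in the text. Partial agreement from the string comparator is not modeled.

```python
import math

# (m, u) first-stage parameters as listed in Table 5-1.
FIRST_STAGE = {
    "first_name": (0.9, 0.1),
    "last_name": (0.9, 0.1),
    "middle_initial": (0.7, 0.3),
    "month_of_birth": (0.8, 0.2),
    "day_of_birth": (0.8, 0.2),
    "computed_age": (0.8, 0.2),
}


def agreement_score(m, u):
    """Match score for full agreement: ln(m/u)."""
    return math.log(m / u)


def disagreement_score(m, u):
    """Match score for full disagreement: ln((1-m)/(1-u))."""
    return math.log((1 - m) / (1 - u))


def passes_first_stage(scores, blank_middle_initial=False,
                       min_full=4, min_total=4.7):
    """Apply the two decision rules to one candidate link.

    `scores` maps each variable name to its match score. A blank middle
    initial is expected to carry a score of zero and is counted as full
    agreement, as the text describes.
    """
    full = 0
    for name, score in scores.items():
        if name == "middle_initial" and blank_middle_initial:
            full += 1
        elif math.isclose(score, agreement_score(*FIRST_STAGE[name])):
            full += 1
    return full >= min_full and sum(scores.values()) >= min_total


# Total score when all six variables fully agree.
full_total = sum(agreement_score(m, u) for m, u in FIRST_STAGE.values())
print(round(full_total, 4))

# Hypothetical link: names and birth date agree, age disagrees,
# middle initial is blank (score zero).
link = {name: agreement_score(*FIRST_STAGE[name]) for name in FIRST_STAGE}
link["computed_age"] = disagreement_score(*FIRST_STAGE["computed_age"])
link["middle_initial"] = 0.0
print(passes_first_stage(link, blank_middle_initial=True))
```

Both rules must hold: a link with four variables in full agreement can still fail if disagreements on the remaining variables pull the total below the minimum score.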
Note that it was tempting to require that only three variables be in full agreement, since this would increase the number of A.C.E. duplicates by 20 percent. However, this change would substantially increase the number of false matches.

Table 5-3 shows that introducing a minimum total score greatly increased the density of A.C.E. links identified.
Note that some A.C.E. duplicate links were dropped by using this criterion. This was a consequence of applying rules that reduced the false link rate.

Second Stage: Household-Level Matching

The second stage of matching was restricted to the household population. The person links from the first stage established a link between two housing units. The second stage was a statistical match of the household members from the two housing units. A sample household was included in the second stage multiple times if the sample household had persons with links to multiple census households in the first stage. This was the same approach used for the ESCAP II duplicate study.

Table 5-2. Distribution of Links Within A.C.E. Clusters by Full Agreement
[Percentages may not add due to rounding]

Variables in full agreement  A.C.E. links (count)  A.C.E. links (percent)  A.C.E. links (cumulative percent)  Total links  Percent of A.C.E. links in row
6                            2,348                 22.2                    22.2                               2,451        95.8
5                            2,895                 27.4                    49.7                               3,983        72.7
4                            1,983                 18.8                    68.4                               6,520        30.4
3                            2,211                 20.9                    89.4                               40,891       5.4
2                            954                   9.0                     98.4                               180,324      0.5
1                            164                   1.6                     99.9                               601,370      <0.1
0                            4                     <0.1                    100                                350,987      <0.1
Total                        10,559                100                     100                                1,186,526    0.9

Table 5-3. Distribution of A.C.E. and Total Links Within A.C.E. Clusters
[Only includes links with a score of 4.7 or greater]

Variables in full agreement  A.C.E. links  Total links  Percent of A.C.E. links in row
6                            2,348         2,451        95.8
5                            2,868         3,763        76.2
4                            1,680         2,670        62.9
3                            0             0            n/a
2                            0             0            n/a
1                            0             0            n/a
0                            0             0            n/a
Total                        6,896         8,884        77.6

Table 5-4. Second-Stage Matching Parameters

Matching variable  Type of comparison  Agreement weight (m)  Disagreement weight (u)  Agreement score ln(m/u)  Disagreement score ln((1-m)/(1-u))
First name         String (uo)         0.9500                0.0125                   4.3307                   -2.9832
Last name          String (uo)         0.9600                0.5700                   0.5213                   -2.3749
Middle initial     Exact               0.0840                0.0220                   1.3398                   -0.0655
Month of birth     Exact               0.6000                0.0600                   2.3026                   -0.8544
Day of birth       Exact               0.3000                0.0200                   2.7081                   -0.3365
Computed age       Age (p)             0.9750                0.1325                   1.9959                   -3.5467
Total                                                                                 13.1984                  -4.1948

The matching variables were the same as the first stage: first name, last name, middle initial, month of birth, day of birth, and age. Table 5-4 gives the matching parameters. The data in this table have similar meaning as that for the
first-stage parameters in Table 5-1. Using a subset of the first-stage links, the second-stage matching parameters were derived using the EM algorithm as described in Winkler (1995). These parameters were anticipated to be more discriminating than the set used for the ESCAP II study.

Since the first-stage matching established a link between two housing units, first name had more discriminating power than last name in the second stage. When first name fully agreed, it contributed 4.3307 toward the total score, while last name only contributed 0.5213 when it was in full agreement. Further, month of birth and day of birth were more powerful than computed age. This was expected since adults in a housing unit often have similar ages, but not the same month and day of birth.

The Statistical Research Division Record Linkage software described in Winkler (1999) was used for the second stage. Each sample record was linked to only one census record within the household, a one-to-one matching. There were no additional blocking criteria beyond household; all possible links within households were compared. Each link had a total match score ranging from -4.1948 to 13.1984.
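The assignment routine used by the record linkage software is not described in this chapter. Purely as an illustration of what a one-to-one constraint means, the sketch below picks the pairing of sample persons to census persons with the highest total match score by brute force. The function name, the score matrix, and the choice of "maximize the total score" as the rule are all assumptions made for this example.

```python
from itertools import permutations


def best_one_to_one(scores):
    """Pick the one-to-one pairing with the highest total match score.

    scores[i][j] is the match score of sample person i against census
    person j. This brute-force sketch assumes a small household with no
    more sample persons than census persons.
    """
    n_sample = len(scores)
    n_census = len(scores[0])
    if n_sample > n_census:
        raise ValueError("sketch assumes no more sample than census persons")
    best_total, best_pairs = float("-inf"), None
    for perm in permutations(range(n_census), n_sample):
        total = sum(scores[i][j] for i, j in enumerate(perm))
        if total > best_total:
            best_total, best_pairs = total, list(enumerate(perm))
    return best_pairs, best_total


# Hypothetical scores for a two-person sample household matched
# against a two-person census household.
household_scores = [
    [13.2, 4.3],
    [4.3, 9.0],
]
pairs, total = best_one_to_one(household_scores)
print(pairs, round(total, 1))
```

Because each census person can be used only once, two sample persons cannot both link to the same census record, which is why within-household duplicates are hard to see at this stage.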
This second-stage match score was used for the modeling. All links with a second-stage match score greater than 0.3419 were retained as input to the modeling.

Reverse Name Matching

Occasionally, first and last name were captured in reverse order on the data files. The first name was in the last name field and the last name was in the first name field.
When the data was in reverse order on one file but not the other, it was difficult to identify these duplicate links since the variable match scores for first and last name disagreed for both the first and second stage. To attempt to identify these cases, the first and last name fields were reversed and then matched to the census files a second time. The duplicate links from both runs, name in the usual order and in reverse order, were input to the modeling. When both methods identified the same duplicate link, the higher of the two match scores was retained and used in the modeling.

MODELING LINKS

Since the goal of this study was to provide duplicate information for calculating A.C.E. Revision II estimates, it was important to provide a measure of confidence that could be incorporated into the estimation methodology. Consequently, modeling efforts focused on methods for estimating the probability that two linked records were duplicate enumerations. An advantage of duplicate probabilities over the Poisson model weights used in ESCAP II was that all duplicate links outside the A.C.E. search area could be reflected in the A.C.E. Revision II estimates. The statistical and exact match modeling approaches were combined to yield an estimated duplicate probability for the linked records from the statistical matching of the E and P samples to the census.

Statistical Match Probability

The statistical match modeling was used when the second-stage matching resulted in two or more duplicate links. After the second-stage matching, each duplicate link between a sample household and census household had an overall match score. So, for each sample housing unit to census housing unit match, a set of match scores was observed. For any resulting set of match scores, a probability of not observing this set of match scores, Pr(NT), was estimated for each link within the sample household.
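A minimal sketch of how such a probability might be computed is shown below. It uses the product form given in the attachment to this chapter, Pr(NT) = [1 - prod Pr(X_d >= x_d)]^n, which assumes independent link scores and takes each tail probability from an empirical score distribution. The function names, the ten-value score distribution, and the critical values used in the example are hypothetical.

```python
from bisect import bisect_left


def tail_probability(sorted_scores, x):
    """Empirical Pr(X >= x) from an ascending list of observed scores."""
    idx = bisect_left(sorted_scores, x)
    return (len(sorted_scores) - idx) / len(sorted_scores)


def pr_not_observing(link_scores, sorted_scores, n_units):
    """[1 - prod_d Pr(X_d >= x_d)] ** n for one set of household links.

    `n_units` is the number of census housing units the sample household
    was matched with; independence of the link scores is assumed.
    """
    joint = 1.0
    for x in link_scores:
        joint *= tail_probability(sorted_scores, x)
    return (1.0 - joint) ** n_units


def statistical_match_probability(pr_nt, minimum):
    """Translate Pr(NT) into 0 or 1 using a critical value."""
    return 1 if pr_nt >= minimum else 0


# Hypothetical empirical distribution of second-stage match scores.
empirical = sorted([0.5, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 12.5, 13.0])

# A sample household with two links, matched to two census housing units.
pr_nt = pr_not_observing([12.0, 10.0], empirical, n_units=2)
print(round(pr_nt, 4))
print(statistical_match_probability(pr_nt, minimum=0.70))
print(statistical_match_probability(pr_nt, minimum=0.97))
```

High link scores give small tail probabilities and push Pr(NT) toward one, while matching the same sample household to more census units (larger n) pulls it down.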
The higher this probability, the more likely that the set of linked records in the household were duplicates.

Since a sample housing unit could have been matched to more than one census housing unit during the second stage, there were multiple sets of duplicate links and match scores for each sample housing unit. Each set of duplicate links for a sample housing unit was assigned a separate Pr(NT) since the match scores differ for each matching attempt. Further, the Pr(NT) for each set of duplicate links for a sample housing unit varied because of the geographic distance of the duplicate links. As shown in the attachment, Pr(NT) was estimated by

$$\Pr(NT) = \left[ 1 - \prod_{d=1}^{p} \Pr(X_d \ge x_d) \right]^{n}$$

where $\Pr(X_d \ge x_d)$ was the probability of getting a total match score $X_d$ greater than or equal to $x_d$, $p$ was the number of duplicate links in the sample household, and $n$ was the number of census housing units the sample household was matched with in the second stage within a given geographic area.

The estimate of the probability of not observing this set of match scores assumed independence of the individual match scores within each household. This assumption was based on using the EM algorithm to determine the second-stage matching parameters. The probability of observing the individual match scores was estimated from the empirical distribution of individual match scores resulting from the entire second-stage matching. Further, this measure accounted for the number of times that a unique sample household was matched to different census households within a given level of geography. The geographical levels were block, tract, same county (outside tract), same state (outside county), and different state.

For the E sample, this analysis was done at the E-sample household level. For the P sample, a household consisted of any combination of nonmovers, outmovers, and inmovers. To account for this, the duplicate links were analyzed separately by mover status when looking at patterns of match scores.

The probability of not observing this set of match scores was translated into a statistical match duplicate probability of 0 or 1 based on critical values that varied by
level of geography. Table 5-5 shows the minimum value of Pr(NT) for assigning a statistical match duplicate probability of 1 for E and P samples.

Table 5-5. Minimum Value for Assigning Statistical Match Probability

Geographic distance of linked records  Minimum Pr(NT), E sample  Minimum Pr(NT), P sample
Same block                             0.00                      0.25
Same tract (different block)           0.70                      0.35
Same county (different tract)          0.97                      0.60
Same state (different county)          0.97                      0.60
Different state                        0.97                      0.60

Duplicate links with a Pr(NT) greater than or equal to the minimum value in Table 5-5 were assigned a statistical match duplicate probability of 1. All other links were assigned a statistical match duplicate probability of 0.

Exact Match Probability

Given exact matching of the census to itself, duplicate probabilities were assigned to linked records by taking into account the overall distribution of births, frequency of names, and population size in a specific geographic area. Duplicate probabilities were computed separately by links within county, links within state and different county, and different states. Further, duplicate links were modeled separately by how common the last name was, as well as for Hispanic names. Fay (2002b) gives the model and preliminary results. The following are excerpts from Fay (2002b) to give the reader a general idea of the approach.

Like the Poisson model, the new approach uses frequencies of occurrences of combinations of first and last name. The result is an estimated probability of duplication for most matches, except for matches of frequently occurring names, where the probability of duplication is low and difficult to estimate with high relative precision.

This work results in a series of probability models, with parameters that can be estimated statistically from observed census data. A core model characterizes probabilities of duplication, triple enumeration (apparent enumeration of the same person three times), and other forms of multiple enumeration within a given geographic area. The other models account for duplication across domain.

The first part of the core model expresses the probability of coincidentally sharing a birthday. A second set of expressions, a model for census duplication, is built on top of the model for coincidental sharing of date of birth.
The core model combines the two models to account for observed patterns of exact computer matches of census enumerations. The core model provides a basis to estimate a probability that a given computer match links the same person instead of two persons coincidentally sharing a birthday. An approximate argument allows the core model to be extended to nested geographic categories, such as (1) counties, (2) other counties within state, and (3) other states.

The result of the exact match model is a duplicate probability greater than or equal to zero, but less than one for census records that agree exactly on first name, last name, month and day of birth and two-year age intervals.

Combining the Two Models

The two approaches were combined to give one duplicate probability to each E- and P-sample duplicate link. Table 5-6 summarizes the results of combining the two models.

The duplicate probabilities for the links to group quarters in the first stage and for one-person household links were from the exact match modeling. For all other links, the duplicate probability was the larger of the two model estimates, as indicated by the shaded cells in Table 5-6. For nonexact matches, the duplicate probability assignment was always based on the statistical match modeling. For exact matches in sample households with two or more persons, adjustments were made to account for the integration of these two methods. The exact match probabilities were determined conditionally, requiring a downward adjustment of the exact probabilities for the links to which the statistical match modeling assigned a probability of zero. The amount of the downward adjustment was based on the upward adjustment made when using the statistical match probability of one instead of the exact match probability.

Table 5-6. Combining the Two Modeling Results

Type of link   Size of sample HU  Type of match  Statistical match probability  Exact match probability
Housing unit   1                  Exact          -                              [0,1)
Housing unit   1                  Nonexact       -                              -
Housing unit   2+                 Exact          1                              [0,1)
Housing unit   2+                 Exact          0                              [0,1)
Housing unit   2+                 Nonexact       1                              -
Housing unit   2+                 Nonexact       0                              -
Group quarter                     Exact          -                              [0,1)
Group quarter                     Nonexact       -                              -

- Modeling did not assign a value.

The results of this modeling provided, for each Full E- and P-sample person who linked to a census person outside the A.C.E. search area, the probability that they were in fact the same person. These probabilities, referred to as $p_t$ in Chapter 6, were used to obtain A.C.E. Revision II estimates.

Reinstated and Deleted Census Records

For the exact match modeling, separate probabilities were computed based on population distributions with and without the reinstated and deleted records. One computed duplicate probability allowed sample records to link to reinstated and deleted census records, while a second duplicate probability did not allow links to reinstated and deleted records. Under this second scenario, any links to reinstated or deleted records were assigned a duplicate probability of zero. The duplicate probabilities used in the A.C.E. Revision II estimation were those that allowed links to reinstated and deleted census records.

ASSESSMENT OF LINKS

Throughout the development of the Further Study of Person Duplication in Census 2000, the A.C.E. duplicate links found during production were the benchmark used to gauge whether the matching algorithm did a good job of finding true duplicates and minimizing the number of false links found within the block cluster.

This study and the ESCAP II duplicate study documented in Fay (2001, 2002) utilized the same method for estimating efficiency for the E sample. Basically, the method estimated the effectiveness of identifying A.C.E. clerical duplicates within the A.C.E. sample area and accounted for duplicate links to reinstated and deleted records that were out-of-scope for A.C.E. Instead of producing one overall efficiency measure, several measures were computed for various levels of detail including size of sample household and number of links between the units.

FORMING ESTIMATES OF DUPLICATES

Estimates of census duplicates were formed by summing the product of the sampling weight for the E-sample person, the duplicate probability, and the multiplicity factor.
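Written out, the estimate is a sum over linked E-sample persons of weight times duplicate probability times multiplicity factor. The short sketch below illustrates the arithmetic only; the tuple layout, the function name, and the three example links are hypothetical, and the multiplicity factor of 0.5 corresponds to the simple two-record case.

```python
def estimate_duplicates(links):
    """Sum of weight * duplicate probability * multiplicity factor.

    `links` is an iterable of (sampling_weight, duplicate_probability,
    multiplicity_factor) tuples, one per linked E-sample person.
    """
    return sum(weight * prob * mult for weight, prob, mult in links)


# Hypothetical E-sample links, each in the simple two-record case.
example_links = [
    (250.0, 1.0, 0.5),  # contributes 125
    (300.0, 0.4, 0.5),  # contributes 60
    (180.0, 0.0, 0.5),  # contributes 0
]
estimate = estimate_duplicates(example_links)
print(round(estimate, 1))
```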
Since a sample of the census (E sample) was matched to the census, a naive approach would treat each duplicate link of A to B as one duplicate. However, had a different sample been drawn, it could have contained the B to A link. Applying a multiplicity factor of 1/2 in this simple case ensured that the estimate for this example was only one duplicate. See Mule (2002b) for more details on the computation of the multiplicity factor.

Attachment. Probability of Not Observing a Set of Match Scores

Each E-sample household had a set of duplicate links to a particular census household. Each duplicate link had a corresponding overall match score from the second-stage matching, resulting in a pattern of match scores for the sample household. The task was to assess whether this observed set of match scores occurred because the links were duplicates or because the records had characteristics in common but were different people.

Objective: To estimate the probability of not observing this set of match scores or better for each E-sample household.

The hypothesis is that the higher the probability of not observing this set of match scores or better, the more likely the links represent duplicate enumerations.

Suppose a particular E-sample household has $p \ge 2$ duplicate links with observed match scores $x_1, x_2, \ldots, x_p$. Define Pr(NT) to be the probability of not observing the set of match scores or better, $(X_1 \ge x_1, X_2 \ge x_2, \ldots, X_p \ge x_p)$. This probability can be expressed as

$$\Pr(NT) = \left[1 - \Pr(X_1 \ge x_1, X_2 \ge x_2, \ldots, X_p \ge x_p)\right]^{n} \qquad (1)$$

where $n$ was the number of different census housing units that the E-sample housing unit was linked to during the second-stage match. This calculation accounted for the fact that the more times the E-sample housing unit matched to different housing units, the greater the chance of obtaining this outcome.

Individual match scores $X_1, X_2, \ldots, X_p$ were assumed to be independent, since the second-stage matching parameters gave more emphasis to first name rather than last name. Further, the parameters gave more emphasis to month and day of birth rather than age. Under the independence assumption, (1) can be written as follows:

$$\Pr(NT) = \left[1 - \prod_{d=1}^{p} \Pr(X_d \ge x_d)\right]^{n} \qquad (2)$$

The probability of observing a match score $X_d$ greater than or equal to $x_d$, $\Pr(X_d \ge x_d)$, was obtained from the empirical distribution of second-stage match scores. The probability in (2) was used for the P-sample households as well.

Chapter 6. A.C.E. Revision II Estimation

The A.C.E. Revision II Dual System Estimate (DSE) methodology was developed with the following objectives in mind:
* Integration of the corrections for measurement errors so that measurement errors identified by both the evaluations and the duplicate study are not over-corrected.
* Separate estimation for both E- and P-sample persons based on whether or not they linked to a census enumeration outside the search area.
* Flexibility in the post-stratification design, because the factors that affect correct enumeration (as measured by the E sample) were not necessarily the same as those that affect coverage (as measured by the P sample).
* Adjustment for correlation bias.

This chapter presents how this additional information was incorporated into the DSE for A.C.E. Revision II estimates.
The reader is assumed to be familiar with the basic dual system model and how it was used to produce the March 2001 A.C.E. estimates. See Haines (2001) for a detailed description of this methodology and Davis (2001) for the original dual system estimation results. This chapter describes the approach to A.C.E. Revision II dual system estimation. The chapter discusses estimation of the term accounting for persons in the census who are not in the E sample. The correct enumeration rate from the E-sample data is described. Then, the estimation of the match rate from the P-sample data is addressed. The census, E-sample, and P-sample data are combined to form a single DSE formula. Next, the post-stratification variables used for the A.C.E. Revision II Full and Revision Samples are defined. The chapter then discusses adjustment for correlation bias using demographic analysis sex ratios and concludes with a discussion of synthetic estimation.

DUAL SYSTEM ESTIMATION

The basic form of the dual system estimate (DSE) is:

$$DSE = (Cen' - II') \times \frac{CE}{E} \times \frac{P}{M} \qquad (1)$$

where
Cen' = the census count, excludes late adds
II' = non-data-defined census records, excludes late adds
LA = late additions to the census, i.e., records included too late for A.C.E. processing; primarily reinstated cases from the housing unit duplication operation
CE = E-sample weighted estimate of census correct enumerations
E = E-sample weighted estimate of census total enumerations (includes insufficient information for matching and followup cases, excludes non-data-defined cases and late adds)
P = P-sample weighted estimate of total persons
M = P-sample weighted estimate of matches to census persons

DSEs were computed separately within post-strata. A post-stratum is a group of people defined by demographic and geographic characteristics who are assumed to have the same probabilities of inclusion in the census. Post-strata can also be defined to have equal probabilities of correct enumeration in the census.

The DSE in (1) can be written as a function of the final census count, Cen, which includes late adds, and the following three rates:

$$DSE = Cen \times r_{DD} \times \frac{r_{CE}}{r_{M}} \qquad (2)$$

where
$r_{DD} = (Cen' - II')/Cen$ is the census data-defined rate. The numerator excludes late adds, but the denominator includes late adds.
$r_{CE} = CE/E$ is the E-sample correct enumeration rate.
$r_{M} = M/P$ is the P-sample match rate.

The three rates can be interpreted as estimates of probabilities. Thus, within post-stratum,
* $r_{DD}$ estimates the probability that a census person record has sufficient (and timely) information for inclusion in A.C.E. processing,
* $r_{CE}$ estimates the probability that an E-sample universe person is a correct enumeration, and
* $r_{M}$ estimates the probability that a person in the P sample is included in the census.

The interpretation of $r_{M}$ may be less obvious than the other two; it is the sample-weighted proportion of P-sample persons who were also found in the census. The general independence assumption underlying DSE is that either the census or the A.C.E. inclusion probabilities are the same (both are not required). Assuming causal independence, the match rate $r_{M}$ estimates the probability of census inclusion for the post-stratum.

Equation (2) also gives an interpretation of how the DSE constructs population estimates within a post-stratum.
* Multiply the census count (Cen) by the data-defined rate, $r_{DD}$, to estimate the number of census persons who are data-defined and, therefore, eligible for inclusion in the E sample.
* Reduce this product by multiplying it by the estimated probability of correct enumeration, $r_{CE}$.
* Increase this result by dividing it by the estimated probability of census inclusion, $r_{M}$.

The primary tasks in developing DSEs at the post-stratum level are the estimation of the three rates involved. The estimate $r_{DD}$ is straightforward because it is based on 100-percent census tabulations. More detail is provided for the estimates $r_{CE}$ and $r_{M}$, since they are more challenging.

The different estimation tasks can be tackled one term at a time. Basically, the goal is to estimate the numerators and denominators of the terms $r_{CE}$ and $r_{M}$. Since $E$, the estimated number of total census data-defined enumerations, is a simple, direct sample-weighted estimate, the challenges relate mostly to developing the estimates $CE$, $P$, and $M$.

The estimation challenges for A.C.E. Revision II focus on accounting for: (i) information from the revised coding of the A.C.E. Revision Sample, (ii) information from the A.C.E. Revision II study of census duplicates, and (iii) different post-stratification schemes for the Full E and P samples. The most difficult issue is (ii).

Before proceeding to a detailed discussion of the A.C.E. Revision II DSE components, consider the general nature of the estimator. While the basic DSE shown in equation (1) was applied in the 1990 PES (Hogan, 1993), the March 2001 A.C.E. incorporated the modification called PES-C estimation. See Haines (2001) and Mule (2001b) for details. This DSE had the general form:

$$DSE^{C} = (Cen' - II') \times \frac{CE}{E} \times \left[\frac{P_{nm} + P_{im}}{M_{nm} + \dfrac{M_{om}}{P_{om}}\, P_{im}}\right] \qquad (3)$$

where the following quantities are all P-sample weighted estimates for the given post-stratum:
$M_{nm}$ = estimate of matches to census persons for nonmovers
$M_{om}$ = estimate of matches to census persons for outmovers
$P_{nm}$ = estimate of total nonmovers
$P_{om}$ = estimate of total outmovers
$P_{im}$ = estimate of total inmovers

Nonmovers, outmovers, and inmovers were defined with reference to their status in the period of time between Census Day (April 1, 2000) and the A.C.E. interview. Nonmovers were those who did not move during this period, outmovers were those persons who moved out of a sample block during this period, and inmovers were those who moved into a sample block during this period. Equation (3) estimated P-sample matches ($M$) as the sum of estimated matches among nonmovers ($M_{nm}$) and estimated matches among movers. The number of mover matches was estimated as the product of an estimated number of movers ($P_{im}$) and an estimate of the mover match rate ($M_{om}/P_{om}$). Thus, P-sample outmovers were used to estimate the mover match rate while P-sample inmovers were used to estimate the number of movers.
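The arithmetic of equations (2) and (3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `dse_from_rates` and `dse_pes_c` are hypothetical, the inputs are taken to be already-weighted post-stratum totals, and the numbers in the usage lines are invented rather than taken from any A.C.E. tabulation.

```python
def dse_from_rates(cen: float, r_dd: float, r_ce: float, r_m: float) -> float:
    """Equation (2): DSE = Cen * r_DD * r_CE / r_M (hypothetical helper)."""
    if r_m <= 0:
        raise ValueError("match rate r_M must be positive")
    return cen * r_dd * r_ce / r_m


def dse_pes_c(cen_prime: float, ii_prime: float, ce: float, e: float,
              p_nm: float, p_im: float, m_nm: float, m_om: float,
              p_om: float) -> float:
    """Equation (3): PES-C form of the DSE for one post-stratum.

    All arguments are weighted totals; the names mirror the symbols in the text.
    """
    if e <= 0 or p_om <= 0:
        raise ValueError("E and P_om must be positive")
    # Outmovers supply the mover match rate ...
    mover_match_rate = m_om / p_om
    # ... and inmovers supply the number of movers.
    matches = m_nm + mover_match_rate * p_im
    persons = p_nm + p_im
    if matches <= 0:
        raise ValueError("estimated matches must be positive")
    return (cen_prime - ii_prime) * (ce / e) * (persons / matches)


if __name__ == "__main__":
    # Invented illustrative totals, not real census figures.
    estimate = dse_pes_c(cen_prime=1000, ii_prime=0, ce=90, e=100,
                         p_nm=80, p_im=20, m_nm=72, m_om=9, p_om=10)
    print(round(estimate, 6))
```

In the invented example the correct enumeration rate and the match rate are both 0.9, so the two corrections cancel and the estimate equals the data-defined count.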
This approach implies that $P_{nm} + P_{im}$ should be used for the estimated total of P-sample persons ($P$).

Equation (3) can be further expanded to include post-stratification subscripts. The Full E- and P-sample post-strata are denoted by subscripts $i$ and $j$, respectively. The census term was calculated for the cross-classification of $i$ and $j$ post-strata, denoted $ij$. The DSE formula, using procedure C for movers, with different post-strata for the E and P samples is:

$$DSE^{C}_{ij} = Cen_{ij} \times r_{DD,ij} \times \frac{\left[\dfrac{CE_{i}}{E_{i}}\right]}{\left[\dfrac{M_{nm,j} + \left[\dfrac{M_{om,j}}{P_{om,j}}\right] P_{im,j}}{P_{nm,j} + P_{im,j}}\right]}$$

ESTIMATION OF $r_{DD}$

Recall the general form of the DSE in equation (2). This section discusses the estimation of the data-defined rate, or DD-rate. The DD-rate estimate ($r_{DD}$) is defined as $(Cen' - II')/Cen$ for a given detailed $ij$ post-stratum, where $Cen'$, $II'$, and $Cen$ are defined from 100-percent census tabulations.

At the post-stratum level, $Cen \times r_{DD}$ reduces to $Cen' - II'$. This suggests that an alternative to computing $r_{DD}$ at a post-stratum level is to compute $Cen' - II'$ for all levels (e.g., demographic post-stratum groups within small geographic areas) for which estimates were to be computed, and then to adjust these quantities by the appropriate $r_{CE}/r_{M}$ factors. This approach may be problematic, especially when applied to very small areas.

The problem with direct computation of $Cen' - II'$ for very small areas can be seen with the following hypothetical example. Suppose a particular small geographic area (e.g., a collection of blocks) has a high rate of imputation in the census, say 15.0 percent. Imputation rates will vary geographically, and high rates could result from a number of factors, such as difficulties getting access to housing units in secure communities or difficulties in hiring and retaining census enumerators in a particular area. In this hypothetical example, removing all imputations from the census count for the area by computing $Cen' - II'$ would reduce the census count by 15.0 percent. Subsequent multiplication by the $r_{CE}/r_{M}$ factors and summing the resulting DSEs over post-strata may increase the population estimate from this base, but perhaps by no more than two or three percent (depending on the post-stratum composition of the area). The net synthetic DSE would, thus, be 12.0 or 13.0 percent lower than the census count. While this estimate could make sense if almost all the housing units for which persons were imputed were actually vacant (and this fact were not discovered in the census enumeration), it would not make sense if most of the units were occupied and the high rate of imputation resulted from other factors such as those suggested above. Calculating $r_{DD}$ for post-strata and applying it synthetically avoids such problems in small area estimates, though perhaps incurring some error for larger areas for which the direct tabulation of $Cen' - II'$ would be sensible.

The data-defined rates, $r_{DD}$, are computed at the detailed post-stratum obtained as the intersection of the E- and P-sample post-strata.

ESTIMATION OF $r_{CE}$

This section discusses the estimation of the correct enumeration rate, $r_{CE} = CE/E$. The Full E-sample post-strata are denoted by the subscript $i$. The Revision E sample has post-strata denoted by $i'$, where $i'$ is based on collapsed post-strata $i$. This means that the Revision Sample post-strata were obtained by collapsing the Full Sample post-strata $i$. The correct enumeration rate is written:
$$r_{CE,i} = \frac{CE^{ND}_{i} \times f_{1,i'} + \widetilde{CE}^{D}_{i}}{E_{i}} \qquad (4)$$

Note that the numerator term separates the E-sample enumerations with a duplicate link to a census enumeration outside the A.C.E. search area, as identified in the duplicate study, from those enumerations without a link.

As discussed in Chapter 5, the duplicate study used computer-based record linkage techniques to match the Full P- and E-samples to census enumerations outside the search area. The census enumerations included those enumerations that were added too late to be included in the E sample, as well as those enumerations that were determined to be duplicates and, therefore, were never included in the census.

The term $CE^{ND}_{i}$ estimates the number of correct enumerations in the Full E sample without duplicate links in post-stratum $i$. This term includes the probability of not being a duplicate, $1 - p_t$.

The component $\widetilde{CE}^{D}_{i}$ represents the estimated number of correct enumerations in the Full E sample with duplicate links in post-stratum $i$, which are retained after unduplication. This term includes the probability of being a duplicate, $p_t$, as well as the conditional probability that an E-sample case is a correct enumeration given that it is a duplicate to another census enumeration outside the A.C.E. search area.

The total weighted number of persons in post-stratum $i$ in the E sample is denoted by $E_{i}$.

The double-sampling ratio factor $f_{1,i'}$ corrects for measurement error based on the Revision E sample. It is a ratio of an estimate that uses the revised coding (indicated by *) to an estimate that uses the original coding. These adjustments, which are calculated for measurement error post-strata $i'$, are represented by:

$$f_{1,i'} = \frac{CE^{ND*}_{i'} / E^{ND}_{i'}}{CE^{ND}_{i'} / E^{ND}_{i'}} = \frac{CE^{ND*}_{i'}}{CE^{ND}_{i'}}$$

P- and E-sample cases with duplicate links were assigned a nonzero probability of being a duplicate, $p_t$. P- and E-sample cases without duplicate links were assigned a $p_t$ value of zero. This probability is usually 0 or 1 for E- and P-sample cases, but some duplicate links have a value in between, indicating less confidence that the link is representing the same person. These probabilities are also transferred to the E- and P-Revision Samples.

Although the duplicate study identified E- and P-sample cases linking to census enumerations outside the A.C.E. search area, this study could not determine which component of the link was the correct one since no additional data were collected for this purpose. Assuming that the linked person does exist, the goal is to determine which of the two locations is the appropriate place to count the person. Since linked persons may be geographically close or far apart, this has implications for the degree of synthetic error.

On the E-sample side, this study does not identify whether the linked E-sample case is the correct enumeration. Thus, it is necessary to estimate the following conditional probability:
z ttheprobabilitythatanE-samplecaseisacorrectenumerationgiventhatitisaduplicatetoanothercensusenumerationoutsidetheA.C.E.searcharea.E-SampleLinksFromtheduplicatestudy,anestimateofcorrectcensusenumerationscanbederivedbyconsideringthesituationofthelinkedenumerations,aswellasassumingthateach linkrepresentsonecorrectenumeration.Thisassumes,ofcourse,thatthelinkconsistsoftrueduplicates.TheseSectionIIChapter66-3A.C.E.RevisionIIEstimationU.S.CensusBureau,Census2000 assumptionsareusedtoestimatethecontributiontocor-rectenumerationsfromFullE-samplecaseswithduplicate links,includingthoseoriginallycodedascorrect,aswell asthoseoriginallycodedaserroneous.Thiscontributiontocorrectenumerationsisgivenbytheterm:
CE~i D.Toesti-matethisterm,theE-samplelinksarefirstclassifiedaccordingtothecharacteristicofthelinkedsituationand theoriginalcodingoftheEsample.Attachment1summa-rizesthisclassificationandtherulesforassigning z ts.First,linkedsituationsareidentifiedwhereonecompo-nentofthelinkisthoughttobecorrectandtheotherincorrect.Ifapersoninahousingunitlinkswithapersoninagroupquarters,suchasacollegedormitory,theper-soninthehousingunitistakentobeincorrectandassigneda z tofzero.SeeLinkedSituation1.inAttachment1.Ifalinkedperson18yearsofageorolderislistedin onlyoneofthehouseholdsasachildofthereferenceper-son,thispersonisassumedtobeincorrectlyincludedwiththeirparentsandcorrectlyincludedintheother household,unlessA.C.E.hadalreadydeterminedthemtobeanerroneousinclusion.Anexampleofthismightbeacollegestudentthatwaslistedwiththeirparentsandalso listedinanoff-campusapartment.Thisisrepresentedby LinkedSituations2a.and2b.inAttachment1.ForotherLinkedSituations,thechoiceofwhichpersoniscorrectisnotclear.Considerlinksbetweenwholehouse-holdswhereallhouseholdmembersareduplicated(LinkedSituation3.).Thisincludesfamiliesthatmight havemovedsometimearoundCensusDayandwereinad-vertentlyincludedatbothplacesorthismightinvolvehouseholdswithmultipleresidenceswithahelpful,but perhaps,uninformedproxyrespondent.Anothersituation,LinkedSituation4.,involveschildrenages0to17,per-hapsofdivorcedparents,thatarelinkedbetweentwodif-ferenthouseholds.Fortheseandallothersituations,itis assumedthatonlyhalfofthesecensusenumerationswithduplicatelinksarecorrect.Toestimatetheconditionalprobability, z t,thattheE-samplepersonisthecorrectenu-meration,controlscellsaredefinedforLinkedSituations3.,4.,and5.,asindicatedinAttachment1,by:*3Race/HispanicOriginDomains*Tenure TheseresultingcontrolcellsaregiveninAttachment2.Withineachcontrolcellthe z tsaredeterminedsuchthatduplicateE-samplecases,originallycodedcorrectorunre-solved,willweightuptoonehalfthenumberofcensus duplicatesidentified,includingtheerroneousenumera-tions.Thisiscalculatedas:
$$z_t = \frac{0.5 \sum_{t} W_t\, p_t}{\sum_{t} W_t\, p_t\, PR_{ce,t}}$$

The summations are over the links in a control cell regardless of the original E-sample coding.

The components of equation (4) are defined below.
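As a rough illustration of the control-cell calculation above, the following Python sketch computes a single $z_t$ for one control cell. The function name `control_cell_z` and the tuple layout `(weight, p_dup, pr_ce)` are assumptions made for this example, and the link values are invented. The sketch applies no cap to the result, since the text does not describe one.

```python
from typing import Iterable, Tuple

# One linked E-sample case: (sampling weight W_t, duplicate probability p_t,
# production probability of correct enumeration PR_ce,t). Hypothetical layout.
Link = Tuple[float, float, float]


def control_cell_z(links: Iterable[Link]) -> float:
    """Conditional correct-enumeration probability z for one control cell.

    z is chosen so that the cases originally coded correct (or unresolved)
    weight up to one half of all weighted duplicates in the cell.
    """
    links = list(links)
    half_of_duplicates = 0.5 * sum(w * p for w, p, _ in links)
    coded_correct = sum(w * p * pr_ce for w, p, pr_ce in links)
    if coded_correct == 0:
        raise ValueError("no weighted duplicates coded correct in this cell")
    return half_of_duplicates / coded_correct


if __name__ == "__main__":
    # Invented links: two coded correct, one coded erroneous, one uncertain link.
    cell = [(100.0, 1.0, 1.0), (100.0, 1.0, 1.0), (100.0, 1.0, 0.0), (50.0, 0.5, 1.0)]
    z = control_cell_z(cell)
    print(round(z, 4))
```

Multiplying the returned value by the weighted count of duplicates coded correct recovers one half of the weighted duplicates in the cell, which is the constraint the text describes.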
$$\widetilde{CE}^{D}_{i} = \sum_{t \in i} W^{E}_{t}\, p_t\, z_t\, PR_{ce,t}$$
is the estimated number of correct enumerations with duplicate links in post-stratum $i$ who were retained after unduplication.

$$CE^{ND}_{i} = \sum_{t \in i} W^{E}_{t}\, (1 - p_t)\, PR_{ce,t}$$
is the number of correct enumerations without duplicate links in post-stratum $i$, where the summation is taken over all enumerations in the A.C.E. E sample in post-stratum $i$.

$W^{E}_{t}$ is the production A.C.E. sampling weight for E-sample person $t$.

$p_t$ is the probability that person $t$ has a duplicate link outside the search area. This is usually 0 or 1, but could be between these two values for probability matches, where the accuracy of the link was uncertain.

$PR_{ce,t}$ is the probability that person $t$ is a correct enumeration in the original production coding. This is either 0 or 1 unless it was not possible to code the E-sample case as a correct or erroneous enumeration. In these cases, a probability of correct enumeration was imputed.

$$f_{1,i'} = \frac{CE^{ND*}_{i'}}{CE^{ND}_{i'}} = \frac{\sum_{t \in i'} W^{E}_{RR,t}\, (1 - p_t)\, PR_{ceR,t}}{\sum_{t \in i'} W^{E}_{R,t}\, (1 - p_t)\, PR_{ce,t}}$$
where

$W^{E}_{RR,t}$ is the A.C.E. Revision Sample weight for person $t$ to be used for Revision Sample coding.

$W^{E}_{R,t}$ is the A.C.E. Revision Sample weight for person $t$ to be used with production coding. These two weights could differ slightly depending on TES status and noninterview adjustment.

$PR_{ceR,t}$ is the probability that person $t$ is a correct enumeration in the A.C.E. Revision Sample coding.

$$E_{i} = \sum_{t \in i} W^{E}_{t}$$
is the total weighted number of persons in the E sample in post-stratum $i$.

ESTIMATION OF $r_{M}$

This section discusses the estimated match rate in equation (2). E-sample post-strata are indexed by $i$, while the P-sample post-strata are indexed by $j$. The match rate for post-stratum $j$ is represented as:
$$r_{M,j} = \frac{M^{ND}_{nm,j}\, f_{2,j'} + \widetilde{M}^{D}_{nm,j} + \left[\dfrac{M_{om,j}\, f_{3,j'}}{P_{om,j}\, f_{4,j'}}\right] \left(P_{im,j}\, f_{5,j'} + g\,(P^{D}_{nm,j} - \widetilde{P}^{D}_{nm,j})\right)}{P^{ND}_{nm,j}\, f_{6,j'} + \widetilde{P}^{D}_{nm,j} + P_{im,j}\, f_{5,j'} + g\,(P^{D}_{nm,j} - \widetilde{P}^{D}_{nm,j})} \qquad (5)$$

The residence status of P-sample movers was adjusted for coding error. The computer matching results were not used. Outmovers in the P sample were collected by a proxy interview, which made it difficult to obtain date of birth and age information. Since date of birth and age were important characteristics used in the computer matching, the movers were only adjusted for coding error.

Although the duplicate study identified E- and P-sample cases linking to census enumerations outside the A.C.E. search area, this study could not determine which component of the link was the correct one, since there were no additional data collected to determine this. Assuming that the linked person does exist, the goal is to determine which of the two locations is the appropriate place to count the person. Since linked persons may be geographically close or far apart, this has implications for the degree of synthetic error.

On the P-sample side, this study does not identify whether the linked P-sample case is a resident on Census Day. Thus, it is necessary to estimate the following conditional probability:
h tistheprobabilitythataP-samplecaseisaresidentonCensusDaygiventhatitlinkstoacensusenumera-tionoutsidetheA.C.E.searcharea.P-SampleLinksUnliketheE-sampleside,theduplicatestudydoesNOTprovideanestimateofthenumberofcorrectCensusDayresidentsinthePsample.Inordertoestimate h ttheprob-abilitythataP-samplecaseisaresidentonCensusDaygiventhatitlinkstoacensusenumerationoutsidethe searcharea,itisnecessarytoborrowtheresulting z tsfromtheE-samplelinks.Attachment1summarizeshow
the h tsborrowinformationfromthe z ts.First,theP-samplelinkstocensusenumerationsoutsidethesearchareaareidentifiedforsituationswhereitcanbedeterminedwhichcomponentofthelinkisthecorrect residence.TheLinkedSituationsandrulesforassigning h tsarethesameasthoseusedforcomparabletypesofE-samplelinks.Forexample,consideraP-sampleperson18yearsofageorolder,listedasachildofthereferencepersonwholinkswithacensusenumerationinahouse-holdwheretheyarenotlistedasachild.ThisP-samplepersonwouldbeassignedan h tofzeroregardlessofhowA.C.E.codedthisperson.Thus,itisassumedthatthisper-sonshouldnothavebeenincludedinthePsample.FortheotherLinkedSituations3.,4.,and5.,thereonceagainisnoinformationtodeterminewhetherthePsample hadthepersonatthecorrectlocationorwhetherthecen-sushadthematthecorrectlocation.Additionally,thereisnoreasonableassumptionabouthowmanyofthese linkedP-samplepersonsshouldbeatthecorrectlocation.Toovercomethisobstacle,itisassumedthattheerrorinidentifyingcorrectresidenceissimilartotheerroriniden-tifyingcorrectenumerationforsimilarsituations.There-fore,the h tforP-samplepersonsissetequaltothe z tdeter-minedfortheEsampleforcomparablelinkedsituations asidentifiedbythecontrolcellsinAttachment2.The h tsarethenincludedintheweightedtallies,alongwiththe p t ,tocalculatetheduplicatecontributiontotheFullP-sample nonmoversandnonmovermatches.Thetermsinequation(5)aredefinedbelow.Summation tjdenotessummationoverA.C.E.FullP-Samplepost-stratumj,whilesummation tj'denotessummationoverRevisionSamplepost-stratum j'.Thesummationnotationalsoindicateswhetherthesumistakenovernonmovers,outmovers,orinmovers,andiftheProduction
()orRevi-sion(R)Samplecodingisused.
$$M^{ND}_{nm,j} = \sum_{\substack{t \in j \\ t \in \text{nonmover} \\ \text{production}}} W^{P}_{t}\, (1 - p_t)\, PR_{res,t}\, PR_{m,t}$$
where

$W^{P}_{t}$ is the P-sample production weight of person $t$.

$p_t$ is the probability that person $t$ has a duplicate link outside the search area.

$PR_{m,t}$ is the probability that person $t$ is a match in the production coding.

$PR_{res,t}$ is the probability that person $t$ is a resident in the production coding.

$$f_{2,j'} = \frac{M^{ND*}_{nm,j'}}{M^{ND}_{nm,j'}} = \frac{\sum_{\substack{t \in j' \\ t \in \text{nonmover} \\ \text{revision}}} W^{P}_{RR,t}\, (1 - p_t)\, PR_{resR,t}\, PR_{mR,t}}{\sum_{\substack{t \in j' \\ t \in \text{nonmover} \\ \text{production}}} W^{P}_{R,t}\, (1 - p_t)\, PR_{res,t}\, PR_{m,t}}$$
is the double-sampling adjustment for nonmover matches.

$PR_{mR,t}$ is the probability that person $t$ is a match in the Revision Sample coding.

$PR_{resR,t}$ is the probability that person $t$ is a resident in the Revision Sample coding.

$W^{P}_{RR,t}$ is the A.C.E. Revision Sample weight for person $t$ to be used for Revision Sample coding.

$W^{P}_{R,t}$ is the A.C.E. Revision Sample weight for person $t$ to be used with production coding. These two weights could differ slightly depending on TES status and the noninterview adjustment.
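The double-sampling adjustment for nonmover matches can be sketched as a ratio of two weighted sums over the same Revision Sample records, one using the revision coding and one using the production coding. The record layout below (`RevisionRecord` and its field names) and the helper `double_sampling_ratio_f2` are hypothetical, introduced only for this illustration, and the example records are invented.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class RevisionRecord:
    """One Revision P-sample nonmover in a post-stratum j' (hypothetical layout)."""
    w_rr: float         # Revision Sample weight used with revision coding
    w_r: float          # Revision Sample weight used with production coding
    p_dup: float        # probability of a duplicate link outside the search area
    pr_res_rev: float   # residence probability, revision coding
    pr_m_rev: float     # match probability, revision coding
    pr_res_prod: float  # residence probability, production coding
    pr_m_prod: float    # match probability, production coding


def double_sampling_ratio_f2(records: Iterable[RevisionRecord]) -> float:
    """Ratio of revision-coded to production-coded nonmover matches without links."""
    records = list(records)
    revised = sum(r.w_rr * (1 - r.p_dup) * r.pr_res_rev * r.pr_m_rev for r in records)
    original = sum(r.w_r * (1 - r.p_dup) * r.pr_res_prod * r.pr_m_prod for r in records)
    if original == 0:
        raise ValueError("production-coded estimate is zero; ratio undefined")
    return revised / original


if __name__ == "__main__":
    # Invented records: the second person is recoded from match to nonmatch.
    sample = [
        RevisionRecord(10.0, 10.0, 0.0, 1.0, 1.0, 1.0, 1.0),
        RevisionRecord(10.0, 10.0, 0.0, 1.0, 0.0, 1.0, 1.0),
    ]
    print(double_sampling_ratio_f2(sample))
```

The Full Sample estimate of nonmover matches without duplicate links would then be multiplied by this factor, as in the numerator of equation (5).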
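$$M_{om,j} = \sum_{\substack{t \in j \\ t \in \text{outmover} \\ \text{production}}} W^{P}_{t}\, PR_{res,t}\, PR_{m,t}$$
is the number of matched outmovers in the Full Sample in post-stratum $j$.

$$f_{3,j'} = \frac{M^{*}_{om,j'}}{M_{om,j'}} = \frac{\sum_{\substack{t \in j' \\ t \in \text{outmover} \\ \text{revision}}} W^{P}_{RR,t}\, PR_{resR,t}\, PR_{mR,t}}{\sum_{\substack{t \in j' \\ t \in \text{outmover} \\ \text{production}}} W^{P}_{R,t}\, PR_{res,t}\, PR_{m,t}}$$
is the double-sampling ratio for matched outmovers for post-stratum $j'$.

$$P_{om,j} = \sum_{\substack{t \in j \\ t \in \text{outmover} \\ \text{production}}} W^{P}_{t}\, PR_{res,t}$$
is the number of outmovers in the Full Sample for post-stratum $j$.

$$f_{4,j'} = \frac{P^{*}_{om,j'}}{P_{om,j'}} = \frac{\sum_{\substack{t \in j' \\ t \in \text{outmover} \\ \text{revision}}} W^{P}_{RR,t}\, PR_{resR,t}}{\sum_{\substack{t \in j' \\ t \in \text{outmover} \\ \text{production}}} W^{P}_{R,t}\, PR_{res,t}}$$
is the double-sampling ratio for outmovers for post-stratum $j'$.

$$P_{im,j} = \sum_{\substack{t \in j \\ t \in \text{inmover} \\ \text{production}}} W^{P}_{t}$$
is the number of inmovers in the Full Sample post-stratum $j$.

$$f_{5,j'} = \frac{P^{*}_{im,j'}}{P_{im,j'}} = \frac{\sum_{\substack{t \in j' \\ t \in \text{inmover} \\ \text{revision}}} W^{P}_{RR,t}\, PR_{inmoverR,t}}{\sum_{\substack{t \in j' \\ t \in \text{inmover} \\ \text{production}}} W^{P}_{R,t}}$$
is the double-sampling ratio for inmovers for post-stratum $j'$.

$PR_{inmoverR,t}$ is the probability that person $t$ in the Revision Sample is an inmover.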
$g\,(P^{D}_{nm,j} - \widetilde{P}^{D}_{nm,j})$

The term $g$ adjusts the number of inmovers for those Full P-sample nonmovers who are determined to be nonresidents because of duplicate links. Some of these nonresidents are nonresidents because they are inmovers and should be added to the count of inmovers. The term $P^{D}_{nm,j} - \widetilde{P}^{D}_{nm,j}$ is an estimate of nonresidents among nonmovers with duplicate links. This term is multiplied by $g$, which is an estimate of the proportion of originally-coded nonmovers with duplicate links who are true nonresidents that have moved in since Census Day. The term $g$ is estimated using the Revision Sample and both the original A.C.E. and the revision coding as follows:

$$g = \frac{P^{*D}_{nm,im}}{P^{*D}_{nm,nr}}$$

$P^{*D}_{nm,im}$ is an estimate of persons (using the Revision P sample) with a duplicate link who were originally coded as nonmovers but the revision coding determined them to be inmovers (a subset of nonresidents).

$P^{*D}_{nm,nr}$ is an estimate of persons (using the Revision P sample) with a duplicate link who were originally coded as nonmovers but the revision coding determined them to be nonresidents.

A couple of important assumptions are:
* If the revision coding determined that a person was a nonresident, they really are a nonresident. That is, revision-coded nonresidents are assumed to be a subset of true nonresidents.
* The rate of inmovers for revision-coded nonresidents is the same as that for true nonresidents.
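The way $g$ feeds into the inmover count in equation (5) can be illustrated with a short Python sketch. The function names `inmover_share_g` and `adjusted_inmovers` are hypothetical, the arguments are assumed to be already-weighted estimates, and the numbers in the usage lines are invented.

```python
def inmover_share_g(p_nm_im_star_d: float, p_nm_nr_star_d: float) -> float:
    """g: share of revision-coded nonresidents (with duplicate links, originally
    coded as nonmovers) that the revision coding found to be inmovers."""
    if p_nm_nr_star_d <= 0:
        raise ValueError("revision-coded nonresident estimate must be positive")
    return p_nm_im_star_d / p_nm_nr_star_d


def adjusted_inmovers(p_im: float, f5: float, g: float,
                      p_nm_d: float, p_nm_d_tilde: float) -> float:
    """Inmover term of equation (5): P_im * f_5 + g * (P_nm^D - P~_nm^D).

    p_nm_d - p_nm_d_tilde estimates nonresidents among nonmovers with
    duplicate links; g moves the inmover share of them into the inmover count.
    """
    nonresidents_with_links = p_nm_d - p_nm_d_tilde
    return p_im * f5 + g * nonresidents_with_links


if __name__ == "__main__":
    # Invented weighted estimates for one post-stratum.
    g = inmover_share_g(p_nm_im_star_d=30.0, p_nm_nr_star_d=120.0)
    total = adjusted_inmovers(p_im=200.0, f5=1.1, g=g, p_nm_d=50.0, p_nm_d_tilde=30.0)
    print(g, round(total, 6))
```

Because inmovers are a subset of nonresidents, $g$ computed this way lies between 0 and 1 whenever the two inputs are consistent estimates.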
$$\widetilde{M}^{D}_{nm,j} = \sum_{\substack{t \in j \\ t \in \text{nonmover} \\ \text{production}}} W^{P}_{t}\, p_t\, h_t\, PR_{m,t}\, PR_{res,t}$$
is the number of duplicate persons determined to have been Census Day residents who matched to the census in post-stratum $j$.

$$P^{ND}_{nm,j} = \sum_{\substack{t \in j \\ t \in \text{nonmover} \\ \text{production}}} W^{P}_{t}\, (1 - p_t)\, PR_{res,t}$$
is the number of nonmovers without links outside the search area in post-stratum $j$.

$$f_{6,j'} = \frac{P^{ND*}_{nm,j'}}{P^{ND}_{nm,j'}} = \frac{\sum_{\substack{t \in j' \\ t \in \text{nonmover} \\ \text{revision}}} W^{P}_{RR,t}\, (1 - p_t)\, PR_{resR,t}}{\sum_{\substack{t \in j' \\ t \in \text{nonmover} \\ \text{production}}} W^{P}_{R,t}\, (1 - p_t)\, PR_{res,t}}$$
is the double-sampling adjustment for nonmovers in post-stratum $j'$.

$$\widetilde{P}^{D}_{nm,j} = \sum_{\substack{t \in j \\ t \in \text{nonmover} \\ \text{production}}} W^{P}_{t}\, p_t\, h_t\, PR_{res,t}$$
is the estimated number of nonmover persons with duplicate links who were residents after unduplication.

$$P^{D}_{nm,j} = \sum_{\substack{t \in j \\ t \in \text{nonmover} \\ \text{production}}} W^{P}_{t}\, p_t\, PR_{res,t}$$
is the number of P-sample persons with duplicate links, regardless of whether they were determined to be residents by the unduplication process.

THE A.C.E. REVISION II DSE FORMULA

The A.C.E. Revision II DSE formula, using procedure C for movers, separate E- and P-sample post-strata, measurement error corrections from the E- and P-Revision Samples, and duplicate study results is:
$$DSE^{C}_{ij} = Cen_{ij} \times r_{DD,ij} \times \frac{\left[\dfrac{CE^{ND}_{i}\, f_{1,i'} + \widetilde{CE}^{D}_{i}}{E_{i}}\right]}{\left[\dfrac{M^{ND}_{nm,j}\, f_{2,j'} + \widetilde{M}^{D}_{nm,j} + \left[\dfrac{M_{om,j}\, f_{3,j'}}{P_{om,j}\, f_{4,j'}}\right] \left(P_{im,j}\, f_{5,j'} + g\,(P^{D}_{nm,j} - \widetilde{P}^{D}_{nm,j})\right)}{P^{ND}_{nm,j}\, f_{6,j'} + \widetilde{P}^{D}_{nm,j} + P_{im,j}\, f_{5,j'} + g\,(P^{D}_{nm,j} - \widetilde{P}^{D}_{nm,j})}\right]}$$

Notation

Terms
CE: Correct enumerations
E: E-sample total
M: Matches
P: P-sample total
f: Adjusts for measurement error
g: Adjusts nonmovers to movers due to duplication

Subscripts
i, j: Full E and P post-strata
i', j': Revision E and P measurement error correction post-strata
nm, om, im: nonmover, outmover, inmover

Superscripts
C: DSE procedure C for movers
ND: Not a duplicate to census enumeration outside search area
D: Duplicate to census enumeration outside search area
~ (tilde): Includes probability adjustment for residency given duplication

In some small post-strata, the number of inmovers was substantially larger than the number of outmovers. If there were only a few outmovers, the outmover match rate was subject to high sampling error. In these post-strata, it was not considered appropriate to apply a suspect match rate to what could be a relatively large number of inmovers, so PES-A was used. PES-A uses only outmovers. PES-A was applied for post-strata with nine or fewer P-sample outmovers. For these post-strata, it was assumed that some of the duplicate links determined not to have been residents were really outmovers. The DSE formula that uses procedure A for movers with different post-strata for the E- and P-samples is:
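$$DSE^{A}_{ij} = Cen_{ij} \times r_{DD,ij} \times \frac{\left[\dfrac{CE_{i}}{E_{i}}\right]}{\left[\dfrac{M_{nm,j} + M_{om,j}}{P_{nm,j} + P_{om,j}}\right]}$$

The A.C.E. Revision II DSE formula, using procedure A for movers, separate E- and P-sample post-strata, measurement error corrections from the E- and P-Revision Samples, and duplicate study results is written:

$$DSE^{A}_{ij} = Cen_{ij} \times r_{DD,ij} \times \frac{\left[\dfrac{CE^{ND}_{i}\, f_{1,i'} + \widetilde{CE}^{D}_{i}}{E_{i}}\right]}{\left[\dfrac{M^{ND}_{nm,j}\, f_{2,j'} + \widetilde{M}^{D}_{nm,j} + M_{om,j}\, f_{3,j'} + g\,(M^{D}_{nm,j} - \widetilde{M}^{D}_{nm,j})}{P^{ND}_{nm,j}\, f_{6,j'} + \widetilde{P}^{D}_{nm,j} + P_{om,j}\, f_{4,j'} + g\,(P^{D}_{nm,j} - \widetilde{P}^{D}_{nm,j})}\right]}$$

This version of the formula is used only when the sample size for outmovers in the Full P sample is strictly less than 10. This formula was used 93 times in the A.C.E. Revision II production process. The new term introduced in this formula is defined as follows:

$$M^{D}_{nm,j} = \sum_{\substack{t \in j \\ t \in \text{nonmover} \\ \text{production}}} W^{P}_{t}\, p_t\, PR_{res,t}\, PR_{m,t}$$
is the number of matched P-sample persons with duplicate links, regardless of whether they were determined to be residents by the unduplication process.

A.C.E. REVISION II POST-STRATIFICATION DESIGN

The Full E- and P-samples with the original coding results that were used to produce the March 2001 estimates of census coverage provided the basis of the A.C.E.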
RevisionIIestimates.TheMarch2001A.C.E.estimatesweredeterminedtobeunacceptablebecauseofthepres-enceoflargeamountsofmeasurementerror.TheseFull sampleswerecomprisedofover700,000samplepersons each.Insteadofonesetofpost-stratificationvariables,theA.C.E.RevisionIIestimatesincludeseparatepost-stratafortheFullEandPsamples,indicatedbysubscripts i and j ,respectively.FullPSampleFortheFullPsample,thenewpost-stratawerenearlyidenticaltothoseusedfortheMarch2001A.C.E.esti-mates.Theonlydifferencewasthatthe0-17agegroupSectionIIChapter66-7A.C.E.RevisionIIEstimationU.S.CensusBureau,Census2000 wassplitintotwogroups,0-9and10-17,whichresultedinsomecollapsingdifferences.TheFullPsample,consist-ingof480post-strata,wasbasedonthefollowingcharac-teristics(asopposedtotheprevious416post-strata):*Race/HispanicOriginDomain
- Tenure*SizeofMetropolitanStatisticalArea*TypeofCensusEnumerationArea*ReturnRateIndicator(Lowvs.High)*Region
- Age*Sex FortheFullPsample,thepost-stratumgroupseitherretainedalleightAge/SexcategoriesorwerecollapsedtofourAge/Sexcategoriesasshownbelow:Figure6-1.P-SampleAge/SexGroupings Age8groups4groups1group*MaleFemaleMaleFemaleMaleFemale 0-9 10-17 18-29 30-49 50+*The1groupisnotusedfortheFullP-samplepost-strata(j),onlytheRevisionP-samplepost-strata(j').Table6-1showsthe64FullP-samplepost-stratumgroups.Thenumberineachcellrepresentsthenumberof Age/Sexcategoriesineachpost-stratumgroup.6-8SectionIIChapter6A.C.E.RevisionIIEstimationU.S.CensusBureau,Census2000 Table6-1.FullP-SamplePost-StratumGroupsandNumberofAgeandSexGroupings(j)Race/HispanicorigindomainnumberTenureMSA/TEAHighreturnrateLowreturnrateNEMWSWNEMWSWDomain7(Non-HispanicWhiteor Someotherrace)
OwnerLargeMSAMO/MB88888484MediumMSAMO/MB88884888SmallMSA&Non-MSAMO/MB88884888AllotherTEAs88888888 NonownerLargeMSAMO/MB88MediumMSAMO/MB88 SmallMSA&Non-MSAMO/MB88 AllotherTEAs88Domain4(Non-HispanicBlack)
OwnerLargeMSAMO/MB 88MediumMSAMO/MBSmallMSA&Non-MSAMO/MB 88AllotherTEAs NonownerLargeMSAMO/MB 88MediumMSAMO/MBSmallMSA&Non-MSAMO/MB 84AllotherTEAsDomain3 (Hispanic)
OwnerLargeMSAMO/MB 88MediumMSAMO/MBSmallMSA&Non-MSAMO/MB 88AllotherTEAs NonownerLargeMSAMO/MB 88MediumMSAMO/MBSmallMSA&Non-MSAMO/MB 84AllotherTEAsDomain5(NativeHawaiianorPacificIslander)
Owner 4 Nonowner 4Domain6(Non-HispanicAsian)
Owner 8 Nonowner 8 AmericanIndianor Alaska NativeDomain1 (On Reservation)
Owner 8 Nonowner 8Domain2(Off Reservation)
Owner 8 Nonowner 8SectionIIChapter66-9A.C.E.RevisionIIEstimationU.S.CensusBureau,Census2000 FullESampleFortheA.C.E.RevisionIIFullEsample,thepost-stratadefinitionshaveundergonemajorrevisions.Someofthe originalpost-stratificationvariableswereomittedand additionalvariableswereadded.Logisticregressionmod-elsidentifiedseveralvariables,notincludedintheFull P-samplepost-stratification,thatweregoodindicatorsof correctenumeration.TheFullEsample,consistingof525 post-strata,wasdefinedusingthefollowing
characteristics:*ProxyStatus
- Race/HispanicOriginDomain*Tenure*HouseholdRelationship*HouseholdSize*TypeofCensusReturn(mailbackvs.nonmailback)
- DateofReturn(earlyvs.late)*Age*SexThenewvariablesproxystatus,householdrelationshipandsize,andtype(mailback/nonmailback)anddate (early/late)ofcensusreturnaredescribedgenerallybelow.
- ProxyStatus.Nonproxyincludesthosehousingunitpersonsforwhomcensusdatawereprovidedbyahouseholdmember.Proxyincludesthosehousingunitpersonsforwhomcensusdatawereprovidedbyanon-householdmember,suchasaneighbororrentalagent.
- HouseholdRelationship.TheHouseholder/Nuclear(HHer/Nuclear)relationshipcategoryincludespersonsinhousingunitsconsistingonlyofthehouseholderwith spouseorownchildren(l7oryounger).TheOtherrelationshipcategoryconsistsofsingle-personhouse-holdsandpersonsinhousingunitswithanyothertype ofrelationship,includingunrelatedpersons.
- HouseholdSize.Householdsize,ornumberofper-sonsresidinginthehousingunit.
- Early/LateMailback.Personsinmailbackhousingunitswithanearliestformprocessingdate.OnorbeforeMarch24isearlyandafterMarch24islate.
- Early/Late Nonmailback. Persons in nonmailback housing units are classified by their earliest form processing date: on or before June 1 is early, and after June 1 is late.

For the Full E sample, the post-stratum groups either retained all eight Age/Sex categories or were collapsed to four, two, or one Age/Sex groups, based on sample sizes, as shown below:

Figure 6-2. E-Sample Age/Sex Groupings
[Figure: Male and Female columns for each of the 8-group, 4-group, 2-group, and 1-group patterns, against age rows 0-9, 10-17, 18-29, 30-49, and 50+.]

Table 6-2 shows the 93 Full E-sample post-stratum groups. The number in each cell represents the number of Age/Sex categories in each post-stratum group.

Table 6-2. Full E-Sample Post-Stratum Groups and Number of Age and Sex Groupings

Proxy groups (a single group per domain):

  Proxy: Domain 7 (Non-Hispanic White or Some Other Race)               8
  Proxy: Domain 4 (Non-Hispanic Black)                                  8
  Proxy: Domain 3 (Hispanic)                                            8
  Proxy: Domain 5 (Native Hawaiian or Pacific Islander)                 1
  Proxy: Domain 6 (Non-Hispanic Asian)                                  4
  Proxy: Domain 1 (American Indian or Alaska Native On Reservation)     4
  Proxy: Domain 2 (American Indian or Alaska Native Off Reservation)    1

Nonproxy groups:

  Tenure             Relationship   HH size   Early      Late       Early          Late
                                              mailback   mailback   non-mailback   non-mailback

  Nonproxy: Domain 7 (Non-Hispanic White or Some Other Race)
  Owner              HHer/Nuclear   2-3       8          8          8              8
                                    4+        8          8          4              8
                     Other          1         2          2          1              2
                                    2-3       8          8          2              4
                                    4+        8          8          4              8
  Nonowner           HHer/Nuclear             8          8          8              8
                     Other                    8          8          8              8

  Nonproxy: Domain 4 (Non-Hispanic Black)
  Owner              HHer/Nuclear             4          4          2              4
                     Other                    8          8          4              8
  Nonowner           HHer/Nuclear             8          8          8              8
                     Other                    8          8          8              8

  Nonproxy: Domain 3 (Hispanic)
  Owner              HHer/Nuclear             8          8          4              8
                     Other                    8          8          4              8
  Nonowner           HHer/Nuclear             8          8          8              8
                     Other                    8          8          8              8

  Nonproxy: Domain 5 (Native Hawaiian or Pacific Islander)
  Owner & Nonowner   HHer/Nuclear             2          2          2              2
                     Other                    2          2          1              2

  Nonproxy: Domain 6 (Non-Hispanic Asian)
  Owner & Nonowner   HHer/Nuclear             8          8          4              4
                     Other                    4          4          2              4

  Nonproxy: Domain 1 (American Indian or Alaska Native On Reservation)
  Owner & Nonowner   HHer/Nuclear             8 (single group across return types)
                     Other                    8 (single group across return types)

  Nonproxy: Domain 2 (American Indian or Alaska Native Off Reservation)
  Owner & Nonowner   HHer/Nuclear             2          2          2              2
                     Other                    2          2          1              2

Revision P Sample

The Revision P sample is a subsample of the Full P sample and consists of over 60,000 sample persons. The Revision P sample has been subjected to an additional field interview and/or rematching operation as part of the original A.C.E. evaluation program. In support of the A.C.E. Revision II program, the Revision P sample has undergone extensive recoding using all available interview data and matching results. Missing data adjustments have also been applied to the Revision P sample. These recoded data are used to correct for measurement error in the Full P sample.

The measurement error correction post-stratum definitions (j') depend on a person's mover status. Both inmovers and outmovers are subdivided into Owner and Nonowner groups. For nonmovers, the measurement error correction post-strata are: American Indians on Reservations (AIR) and, for the Non-AIR cases, a cross of Tenure (Owner versus Nonowner) with eight Age and Sex categories. The Age/Sex collapsing pattern from the Full P sample is retained when defining the measurement error correction post-strata. The Revision P-sample post-strata (j') are defined as follows:

Figure 6-3. Revision P-Sample Post-Strata (j')
[Figure: rows by Mover Status & Domain and Tenure; Age columns for 8 groups (Male, Female) and 1 group. Rows: Movers, Domains 1 thru 7 (Owner; Nonowner). Nonmovers, Domains 2 thru 7 (Owner; Nonowner), each with age rows 0-9, 10-17, 18-29, 30-49, and 50+ and N/A under 1 group. Nonmovers, Domain 1 (American Indian or Alaska Native On Reservation). N/A means not applicable.]

Revision E Sample

The Revision E sample is a subsample of the Full E sample and consists of over 75,000 sample persons. The Revision E sample has been subjected to an additional
field interview and/or rematching operation as part of the original A.C.E. evaluation program. In support of the A.C.E. Revision II program, the Revision E sample has undergone extensive recoding using all available interview data and matching results. Missing data adjustments have also been applied to the Revision E sample. These recoded data are used to correct for measurement error in the Full E sample.

For the Revision E sample, the measurement error correction post-strata are: Proxies, American Indians on Reservations (AIR) and, for the Nonproxy/Non-AIR cases, a cross of a two-level Relationship variable with eight Age/Sex categories. Note that Household Size is collapsed out of the Household Relationship/Size variable. The Age/Sex collapsing pattern from the Full E sample is retained when defining the measurement error correction post-strata. The Revision E-sample post-strata (i') are defined as follows:

Figure 6-4. Revision E-Sample Post-Strata (i')
[Figure: rows by Proxy Status & Domain and Relationship; Age columns for 8 groups (Male, Female) and 1 group. Rows: Proxy, listed for Domain 7 (Non-Hispanic White or Some Other Race), Domain 4 (Non-Hispanic Black), Domain 3 (Hispanic), Domain 5 (Native Hawaiian or Pacific Islander), Domain 6 (Non-Hispanic Asian), Domain 1 (American Indian or Alaska Native On Reservation), and Domain 2 (American Indian or Alaska Native Off Reservation). Nonproxy, Domains 2 thru 7 (HHer/Nuclear; Other), each with age rows 0-9, 10-17, 18-29, 30-49, and 50+ and N/A under 1 group. Nonproxy, Domain 1 (American Indian or Alaska Native On Reservation). N/A means not applicable.]

ADJUSTMENT FOR CORRELATION BIAS USING DEMOGRAPHIC ANALYSIS

The dual system estimates are adjusted to correct for correlation bias. Correlation bias exists whenever the probability that an individual is included in the census is not independent of the probability that the individual is included in the A.C.E. This form of bias generally has a downward effect on estimates, because people missed in the census may be more likely to also be missed in the A.C.E. Estimates of correlation bias are calculated using the two-group model and sex ratios from Demographic Analysis (DA). The sex ratio is defined as the number of males divided by the number of females. This model assumes no correlation bias for females or for males under 18 years of age; no correlation bias adjustment for non-Black males aged 18-29; and that Black males have a relative correlation bias that is different than the relative correlation bias for non-Black males. The correlation bias adjustment is also done by three age categories: 18-29, 30-49, and 50 and over. This model further assumes that relative correlation bias is constant over male post-strata within age groups. The Race/Hispanic Origin Domain variable is used to categorize Black and non-Black.

The DA totals are adjusted to make them comparable with A.C.E. Race/Hispanic Origin Domains. Black Hispanics are subtracted from the DA total for Blacks and added to the DA total for non-Blacks. This is done because the A.C.E. assigns Black Hispanics to the Hispanic domain, not the Black domain. The second adjustment deletes group quarters people from the DA totals using Census 2000 data.
The reason for making this adjustment is that the group quarters population is not part of the A.C.E. universe. A final adjustment that could be made would be to remove the Remote Alaska population from the DA totals, since it too is not part of the A.C.E. universe. Since this population is small, the DA sex ratios would not be affected in any meaningful way. The resulting DA sex ratios for the three age groups by Black and non-Black domain are shown in Attachment 3.

In general, the correlation bias adjustment factor, $c_k$, is defined for the three age groups $k$ such that:

$$E[c_k \, DSE_k^m] = \text{True male population for age group } k,$$

where $DSE_k^m$ is the sum of DSEs over male post-strata in age group $k$. Since the purpose of this adjustment is to reflect persons missed in both the census and the A.C.E., the value of $c_k$ was not allowed to be less than one.

Correlation Bias Adjustment for Black and Non-Black Males 18 Years and Older

The correlation bias adjustment for Black and non-Black males 18 years and older is done so that the A.C.E. Revision II sex ratios will agree with the DA sex ratios for Blacks and non-Blacks. This correlation bias adjustment is calculated as:

$$c_{R,k} = \left( \frac{\sum_{ij \in k} DSE_{ij}^{R,f} }{\sum_{ij \in k} DSE_{ij}^{R,m} } \right) r_{DA}^{R,k}$$

where
$DSE_{ij}^{R,f}$ = DSE for race R = Black or non-Black, female post-strata ij.
$DSE_{ij}^{R,m}$ = DSE for race R = Black or non-Black, male post-strata ij.
$r_{DA}^{R,k}$ = DA sex ratio for race R = Black or non-Black, for age group k, as given in Attachment 3.

The sum over the ij post-strata includes only the intersection of those post-strata with age group k.

DSEs Adjusted for Correlation Bias

A correlation bias-adjusted DSE for a male, 18+ post-stratum ij in age-race group k is calculated as:

$$\tilde{DSE}_{ij}^m = c_k \, DSE_{ij}^m$$

For all remaining post-strata, which include female post-strata as well as post-strata for persons under 18 years of age, no correlation bias adjustment is done. Thus:

$$\tilde{DSE}_{ij}^f = DSE_{ij}^f$$

The $\tilde{DSE}_{ij}$ are then used to form synthetic estimates.

SYNTHETIC ESTIMATION

The coverage correction factors for detailed post-strata ij are calculated as:

$$\tilde{CCF}_{ij} = \frac{\tilde{DSE}_{ij} }{Cen_{ij} }$$

where the $\tilde{DSE}_{ij}$ are the correlation bias-adjusted DSEs for post-stratum ij and the $Cen_{ij}$ are the census counts for post-stratum ij. Note that this $Cen_{ij}$ includes late census adds.

A coverage correction factor was assigned to each census person, except those in group quarters or Remote Alaska. Effectively, these persons have a coverage correction factor of 1.0. In dealing with duplicate links to group quarters persons, the person in the group quarters was treated as the correct enumeration; that is, the group quarters was taken to be their correct residence on Census Day.

A synthetic estimate for any area or population subgroup b is given by:

$$\tilde{N}_b = \sum_{ij \in b} Cen_{b,ij} \, \tilde{CCF}_{ij}$$

Note that the coverage correction factor can be expressed as:

$$\tilde{CCF}_{ij} = \left( \frac{DD_{ij} }{Cen_{ij} } \right) \left( \frac{r_{CE,i} }{r_{M,j} } \right) c_k$$

where
$r_{CE,i}$ is the correct enumeration rate component of the DSE, varying over i post-strata.
$r_{M,j}$ is the match rate component of the DSE, varying over j post-strata.
$c_k$ is the correlation bias adjustment factor, varying over the Black and non-Black groups and k age cells.
$DD_{ij} / Cen_{ij}$ is the data-defined rate, varying over the ij post-strata.

Attachment 1. Rules for Assigning z_t & h_t for Full P- and E-Sample Duplicate Links

The Linked Situations and assignment of z_t's and h_t's occur in the order listed below. Each linked situation is written as (E or P) (Census).

  1.  (Person in a housing unit) (Person in a group quarters)
        Original E coding EE:     z_t = 0     Original P coding NonRes:  h_t = 0
        Original E coding CE/UE:  z_t = 0     Original P coding Res/UE:  h_t = 0
  2a. (Person 18+, child of reference person) (Person 18+, not child of reference person)
        Original E coding EE:     z_t = 0     Original P coding NonRes:  h_t = 0
        Original E coding CE/UE:  z_t = 0     Original P coding Res/UE:  h_t = 0
  2b. (Person 18+, not child of reference person) (Person 18+, child of reference person)
        Original E coding EE:     z_t = 0     Original P coding NonRes:  h_t = 0
        Original E coding CE/UE:  z_t = 1     Original P coding Res/UE:  h_t = 1
  3.  (All persons in a housing unit) (All persons in another housing unit)
        Original E coding EE:     z_t = 0     Original P coding NonRes:  h_t = 0
        Original E coding CE/UE:  z_t = z1    Original P coding Res/UE:  h_t = z1
  4.  (Child 0-17) (Child 0-17)
        Original E coding EE:     z_t = 0     Original P coding NonRes:  h_t = 0
        Original E coding CE/UE:  z_t = z2    Original P coding Res/UE:  h_t = z2
  5.  All remaining linked situations
        Original E coding EE:     z_t = 0     Original P coding NonRes:  h_t = 0
        Original E coding CE/UE:  z_t = z3    Original P coding Res/UE:  h_t = z3

EE is erroneous enumeration.
CE is correct enumeration.
UE is unresolved.
Res is resident on Census Day.
NonRes is not a resident on Census Day.

Attachment 2. Control Cells for Linked E Sample

  Race/Hispanic Origin Domain                                      Tenure     Linked situation
  Domain 4 (Non-Hispanic Black)                                    Owner      3, 4, 5
                                                                   Nonowner   3, 4, 5
  Domain 3 (Hispanic)                                              Owner      3, 4, 5
                                                                   Nonowner   3, 4, 5
  Domain 7 (Non-Hispanic White or Some Other Race),                Owner      3, 4, 5
  Domain 5 (Native Hawaiian or Pacific Islander),                  Nonowner   3, 4, 5
  Domain 6 (Non-Hispanic Asian),
  Domain 1 (American Indian or Alaska Native On Reservation),
  Domain 2 (American Indian or Alaska Native Off Reservation)

Attachment 3. Correlation Bias Adjustment Groupings and Factors

  Race/Hispanic Origin Domain                                      Age     DA sex ratio   Adjustment factor
  Black:                                                           18-29   0.90           1.08
    Domain 4 (Non-Hispanic Black)                                  30-49   0.89           1.10
                                                                   50+     0.76           1.05
  Non-Black:                                                       18-29   1.04           1.00*
    Domain 3 (Hispanic)                                            30-49   1.01           1.02
    Domain 7 (Non-Hispanic White or Some Other Race)               50+     0.86           1.01
    Domain 5 (Native Hawaiian or Pacific Islander)
    Domain 6 (Non-Hispanic Asian)
    Domain 1 (American Indian or Alaska Native On Reservation)
    Domain 2 (American Indian or Alaska Native Off Reservation)

* This number was set to 1.00 due to the inconsistency between DA and A.C.E. Revision II results.

Chapter 7. Assessing the Estimates

INTRODUCTION

The evaluations of the A.C.E. Revision II estimates may be divided into two categories. One category contains the evaluations that focus on individual error components. The other group consists of comparisons of the relative error between the census and the A.C.E. Revision II estimator. This chapter provides a brief description of the evaluation studies.

The component errors examined by separate studies are sampling error, error from imputation model selection, error due to using inmovers to estimate outmovers in PES-C, synthetic error, error in the identification of the census duplicates as determined by administrative records, error in the identification of computer duplicates as determined by a clerical review, error from inconsistent post-stratification variables, and potential error arising from the automated coding of some cases, called the at-risk coding, in the Revision Sample. The comparisons of relative error between the census and the A.C.E. Revision II estimator include a comparison with Demographic Analysis, the construction of confidence intervals that account for bias as well as random error, and loss function analyses. Also in this category is an examination of the consistency of the estimates of coverage error measured by the A.C.E. Revision II estimator and the Housing Unit Coverage Study (HUCS).

Although an adjustment for correlation bias is included in the A.C.E. Revision II estimates, no evaluations address the error in the level of correlation bias or the model used to distribute it across post-strata. The reason is that examining alternative models only accounts for differences in models. Those differences
would reflect the variations in how the several models correct the original DSEs for correlation biases, but would not reflect the presence or absence of correlation bias in the corrected DSEs.

SAMPLING ERROR

Sampling error gives rise to random error, which is quantified by sampling variance. The sampling variance is present in any estimate based on a sample instead of the whole population. The variance estimation methodology is a simplified jackknife with the block clusters being the primary sampling unit. The effect of within-cluster subsampling is implicitly captured in the weighting.

The March 2001 A.C.E. data showed that the simplified jackknife method produces satisfactory variance estimates. Since a correlation bias adjustment was included in the A.C.E. Revision II estimates, the adjustment for correlation bias was recalculated for each replicate. An alternative variance estimation procedure assumed that the form of the correlation bias adjustment was a scalar times the double-sampling estimator. The replication method also accounts for the A.C.E. block cluster sampling.

SYNTHETIC ERROR EVALUATION

The A.C.E. Revision II has several potential sources of synthetic error. One source involves correcting the individual post-stratum estimates for error estimates at more aggregate levels, such as corrections for correlation bias and measurement coding errors. However, the evaluation of synthetic error focuses on error in small area estimation. Synthetic estimation bias arises when areas in a post-stratum have different coverage error rates, but have the same census coverage correction factor. To assess synthetic estimation bias for a given area, an estimate based on data from the area alone, called a direct estimate, must be developed. Such an estimate is possible only for large areas. In lieu of direct estimates, synthetic estimation bias in undercount estimates is estimated from analysis of artificial populations or surrogate variables whose geographic distributions are known. These surrogate variables are constructed as best as possible to have patterns similar to coverage error. Sensitivity analyses assess the impact of synthetic estimation bias for these variables.

The evaluation of synthetic error within post-strata uses an artificial population analysis similar to those conducted for ESCAP I and ESCAP II. These studies are documented in Griffin and Malec (2001, 2001b). This time, however, the evaluation compares the A.C.E. Revision II estimates and Census 2000. The study uses loss functions for assessing the effect of synthetic error. The major products are:
- Estimates of the bias in the difference between census loss and A.C.E. Revision II estimator loss.
- Indicator of whether the decision to use the A.C.E. Revision II estimator would have changed due to synthetic error.

ERROR DUE TO USING INMOVERS TO ESTIMATE OUTMOVERS IN PES-C

The error due to using inmovers to estimate outmovers is unique to the PES-C model for dual system estimation used in the original A.C.E. and the A.C.E. Revision II. For the PES-C model, the members of the P sample are the residents of the housing units on Census Day. There is some difficulty in identifying all the residents of all the housing units on Census Day because some move prior to the A.C.E. interview. The A.C.E. interview relies on the respondents to identify those who have moved out, the outmovers. Since the outmovers are identified by proxies, many of the outmovers are not recorded. Therefore, the estimate of outmovers is too low. To avoid a bias caused by an underestimate of the number of movers, PES-C uses the number of inmovers to estimate the number of outmovers. The inmovers are those who did not live in the sample blocks on Census Day, but moved in prior to the A.C.E. interview. Theoretically, the number of inmovers in the whole country should equal the number of outmovers.
However, the number of inmovers may not equal the number of outmovers in a post-stratum because of circumstances such as economic conditions causing more people to move out of an area than to move into an area.

The first step of the methodology consists of raking the number of outmovers to total inmovers. The distribution of the raked outmovers may better describe the outmovers than the distribution of the inmovers. The A.C.E. Revision II estimates formed by using the number of inmovers are compared with the A.C.E. Revision II estimates calculated using the raked number.

ERROR FROM IMPUTATION MODEL SELECTION

This project estimates the uncertainty due to choice of imputation model by drawing on the analysis of reasonable alternatives to the imputation model conducted in 2001. See Keathley et al. (2001) for details. The ideal approach would be to repeat the very time-consuming analysis of reasonable alternatives for the A.C.E. Revision II estimator. However, this analysis was not conducted due to limited resources. Instead, an estimate of the additional variance due to the choice of imputation model is developed using the previous A.C.E. work.

Estimates are formed of the variance component for census coverage correction factors that accounts for the missing data error component due to the imputation of enumeration status, residency status, match status, and the P-sample noninterview adjustment. The replicates used to estimate the missing data variance are used in the loss function analysis to represent the random error due to the choice of the imputation models for missing data.

EXAMINING THE QUALITY OF THE COMPUTER DUPLICATES WITH ADMINISTRATIVE RECORDS

Administrative records provide an opportunity to examine the quality of the estimates of duplicate enumerations used in the A.C.E. Revision II estimates. This study uses the Statistical Administrative Records System (StARS) 2000 (Leggieri et al., 2002; Judson, 2000) to assess the effectiveness of the automated methodology used in the Further Study of Person Duplication (FSPD) to identify duplicate enumerations. Secondary goals are to provide data that can be analyzed to determine the nature of the census duplication, so that the information may be used in reducing census duplication in 2010, and to aid in the evaluation of the methodology for the construction of StARS 2000. The study produces a comparison of the estimated amount of census duplication based on administrative records with the estimate from FSPD.

StARS is a new methodology that compiles seven administrative records files, including files from IRS, Medicare, HUD, and Selective Service (see footnote 1). The evaluation uses a previous match between the census and StARS 2000 to assign an Identification (ID) Number to as many census records as possible. The process of assigning ID Numbers was based on name and address. One pass through the census files used both the address and the name to assign ID Numbers. A second pass used only the name and birthdate. A census record was assigned an ID Number only if it was linked with exactly one ID Number.

Census enumerations with the same ID Number are considered duplicates. The method accounts for coincidental agreement of names by requiring assignment of ID Numbers only when exactly one ID Number was linked to the enumeration. In most cases, two people with very similar names and characteristics would have linked to each other's ID Number and would not have been assigned a unique ID Number.

CLERICAL REVIEW OF COMPUTER DUPLICATES

The study examines the accuracy of the FSPD computer identification of duplication in the census by having clerks review the enumerations that the computer designates as duplicates. The clerks determine whether the sets of two enumerations appear to be the same persons. In addition, census enumerations identified as duplicates by administrative records, but not by the computer, also have a clerical review. The potential census duplicates identified by administrative records are a by-product of the evaluation of the computer duplicates using administrative records.

The review is restricted to duplicates between enumerations in the E sample in the A.C.E. blocks and census enumerations outside the search area. Links between P-sample nonmatches and enumerations outside the search area also are reviewed. The clerical review produces the following:
- Number of E-sample enumerations with false duplicate links identified by the computer.
- Number of E-sample enumerations with missed duplicates identified by administrative records that are correct.
- Number of P-sample nonmatches with false duplicate links identified by the computer.
- Number of P-sample nonmatches with missed duplicates identified by administrative records that are correct.

With these results, the accuracy rate for the computer identification of duplicates in the census and between the P-sample nonmatches and the census can be computed.

Footnote 1: The Census Bureau obtains administrative data for its StARS database as authorized by Title 13 U.S.C., section 6 and supported by provisions of the Privacy Act of 1974. Under Title 13, the Census Bureau is required to protect the confidentiality of all the information it receives directly from respondents or indirectly from administrative agencies and is permitted only to use that information for statistical purposes.

AT-RISK CODING

The study assesses the amount of error at risk due to not having each and every case in the Evaluation Follow-up (EFU) sample reviewed clerically (Adams and Krejsa, 2002).
The data collected in the Evaluation Follow-up of the A.C.E. found errors in the coding of E-sample census enumeration status and P-sample residence and match status that needed to be corrected for the A.C.E. Revision II estimator. Ideally, this would mean recoding the entire A.C.E. sample, but that was not possible because the Evaluation Follow-up collected data in only 2,259 out of the 11,303 A.C.E. sample clusters. Even clerically recoding the 70,000 cases in the Evaluation Follow-up sample was not feasible because of time constraints. A new strategy was devised to provide the highest-quality data possible in the time allowed by restricting the clerical review to the more difficult cases. This strategy reduced the clerical workload to about 25,000 cases, which was feasible, and ensured the largest sample possible for the A.C.E. Revision II estimates.

Since the Person Follow-up (PFU) and the Evaluation Follow-up (EFU) questionnaires had been keyed and were available in electronic form, data were combined using an algorithm based on the keyed data and a clerical coding of the categories of cases where the computer did not appear to do a good job.

The method compares the code assigned based on the PFU questionnaire to the code assigned based on the EFU questionnaire, and then determines the best code. The effectiveness of the computer algorithm is assessed by the agreement between the two new codes, and a comparison with recodes assigned in the fall of 2001 to a subsample of the EFU E sample called the Person Follow-up/Evaluation Follow-up (PFU/EFU) Review. The PFU/EFU Review is believed to have been the best A.C.E. coding operation.

For the P sample in the Evaluation Follow-up, a coding algorithm for the keyed data from the PFU and EFU questionnaires also was developed. Assessing the quality was not as easy for the nonmatches and unresolved cases as for the matches. Although recodes from the PFU/EFU Review were available for the matches in the P sample, none of the nonmatches or unresolved cases were included.

The categories of cases not sent for clerical review had a high agreement rate between the PFU and EFU codes assigned by the computer algorithm. For the cases in these categories where the PFU and EFU disagreed, the selected code came from the form with more detailed information. Therefore, there are three types of cases in the estimation:
1. The PFU and EFU codes assigned by computer agree.
2. The PFU and EFU codes assigned by computer disagree, but are in a category where there is high consistency between the PFU and EFU codes, and either the PFU form or the EFU form does not have answers to all the questions. The code for the form with complete data is selected.
3. Clerically assigned codes.

The first group is called the at-risk cases. These cases may have a higher risk of error than the others because of the lack of clerical review, even though the codes assigned by the computer algorithm agree. However, cases in the second group may also have error, although they are in a category with high consistency between the PFU and EFU. For these cases, there is no way to assess the risk of error due to the lack of information on one of the forms.

To assess the potential for error, the at-risk cases are assumed to have the same error rate as cases in their category in the PFU/EFU Review. The potential impact is assessed by comparing the A.C.E. Revision II double-sampling adjustment factors with the double-sampling ratios under the assumption that incorporates the error rates. The double-sampling adjustment factors are described in Chapter 6.

INCONSISTENCY OF POST-STRATIFICATION VARIABLES

Inconsistency in the E- and P-sample reporting of the characteristics used in defining the post-strata may create a bias in the dual system estimate (DSE). This bias affects the estimation of the P-sample match rate.

The analysis of the post-stratification variables for the A.C.E. Revision II estimator was similar to the investigation done for the original A.C.E. The basic approach was to estimate the inconsistency in the post-stratification variables using the matches, then assume that the rates also held for the nonmatches. The models used for the inconsistency analysis of the original A.C.E. post-strata,
described in Haberman and Spencer (2001), were fitted in two steps: (1) models for inconsistency of basic variables, and (2) derivation of inconsistency probabilities for post-stratification given the inconsistency probabilities of the basic variables. The inconsistency probabilities led to an estimate of the bias in the P-sample match rate that was used to estimate the bias in the DSE. The approach taken for the A.C.E. Revision II estimator is to re-calculate the models in (1) and (2) to reflect revisions in the P-sample post-stratification and repeat the analysis.

To assess the bias due to inconsistency in the post-stratification variables, the A.C.E. Revision II estimates are calculated with a correction to the match rate for the inconsistency. Estimates with and without the correction are then compared.

CONSISTENCY BETWEEN THE A.C.E. REVISION II ESTIMATOR AND HUCS

The study examines the validity of the A.C.E. Revision II estimates by assessing the consistency in the results from the A.C.E. Revision II estimates and the Housing Unit Coverage Study (HUCS) described in Barrett et al. (2001). Since the A.C.E. Revision II estimates could have been used in the post-censal estimates program that utilizes the average household size in many calculations, it is important to consider the consistency between the A.C.E. Revision II estimates and the HUCS data.

A.C.E. Revision II estimates census coverage for people and HUCS estimates census coverage for housing units. Patterns in the differential coverage for demographic and geographic groups were examined. Similar patterns in the measures of change in census coverage between 1990 and 2000 for demographic and geographic groups are expected. If there is a substantial difference in the census coverage error caused by missing whole households and by missing people within households, the patterns of differential coverage of people and of housing units may not be similar.

If there are demographic or geographic groups where the differential coverage from the A.C.E. Revision II estimator and HUCS is substantially different, the study attempts to describe whether the disagreement is a symptom of problems with the A.C.E. Revision II estimator or HUCS, or the result of legitimate differences in coverage.

RELATIVE ACCURACY OF THE CENSUS AND A.C.E. REVISION II ESTIMATOR USING DEMOGRAPHIC ANALYSIS

Demographic Analysis (DA) uses vital records, immigration statistics, and Medicare data to obtain an estimate of the population size. Since the methods are somewhat independent of the census, DA provides a method for assessing the relative quality of the census and the A.C.E. Revision II. The consistency of estimates of differential census coverage from the A.C.E. Revision II estimator and DA is assessed for demographic groups.

Estimates of differential census coverage are compared by demographic characteristics, including race, sex, and age.
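The sex ratio used in these DA comparisons is the same quantity that drives the Chapter 6 correlation bias adjustment, so a small numerical sketch may help fix ideas. The Python below is only an illustration under stated assumptions: the function names and the DSE totals are hypothetical and do not come from the A.C.E. production system, and only the DA sex ratio of 0.90 is taken from Attachment 3. It computes the factor that brings the male-to-female ratio of the DSEs into agreement with the DA ratio, floored at one as the text describes.

```python
def sex_ratio(males, females):
    """Sex ratio as defined in the text: males divided by females."""
    return males / females


def correlation_bias_factor(dse_female_total, dse_male_total, da_sex_ratio):
    """Factor that scales male DSEs so their ratio to female DSEs equals the DA ratio.

    The factor is not allowed to fall below one, mirroring the Chapter 6 rule.
    """
    raw = da_sex_ratio * dse_female_total / dse_male_total
    return max(1.0, raw)


# Hypothetical DSE totals for one race-by-age group (made up for illustration).
dse_female = 1_000_000.0
dse_male = 850_000.0
da_ratio = 0.90  # DA sex ratio listed in Attachment 3 for Black, ages 18-29

c = correlation_bias_factor(dse_female, dse_male, da_ratio)
adjusted_male = c * dse_male  # female DSEs are left unchanged

print(f"unadjusted sex ratio: {sex_ratio(dse_male, dse_female):.3f}")
print(f"adjustment factor:    {c:.4f}")
print(f"adjusted sex ratio:   {sex_ratio(adjusted_male, dse_female):.3f}")
```

When the unadjusted ratio already exceeds the DA ratio, the raw factor falls below one and the floor leaves the male DSEs unchanged.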
The estimates of population size based on DA are not viewed with as much confidence as the estimates of differential coverage. DA does a better job of measuring differences in coverage between groups than population size.

In addition, sex ratios from the A.C.E. Revision II estimates and DA are compared. The sex ratio is the ratio of males to females and provides a measure of differential coverage of males and females, especially when calculated for race groups.

These comparisons are repeated with 1990 Post-Enumeration Survey and DA estimates to provide a context for viewing the comparisons with the 2000 data. An assessment is conducted to determine whether both methods measure the same change in differential net undercounts from 1990 to 2000.

RELATIVE ACCURACY OF THE CENSUS AND THE A.C.E. REVISION II ESTIMATOR USING CONFIDENCE INTERVALS AND LOSS FUNCTION ANALYSIS

Two additional methods of assessing the relative accuracy of the census and the A.C.E. Revision II estimates are using confidence intervals for the net undercount rate and a loss function analysis. Confidence intervals for net undercount rates are formed using estimates of net bias and variance. Since most of the data available on the quality of the original A.C.E. is being incorporated in the A.C.E. Revision II estimates, the estimation of the net bias uses the data that were not included. In the loss function analysis, the mean squared error weighted by the reciprocal of the census count is used to estimate loss for levels and shares for counties and places across the nation and within state.

Confidence intervals that incorporate the net bias as well as the variance for the net undercount rate provide a method for comparing the relative accuracy of the census and the A.C.E. Revision II estimates. The net bias in the census coverage correction factor is estimated for each post-stratum. With the estimated bias and variance for each census coverage correction factor, the bias $\hat{B}(\hat{U})$ and variance $\hat{V}$ in the net undercount rate $\hat{U}$ are estimated. Also, 95 percent confidence intervals for the net undercount rate are constructed by

$$\left( \hat{U} - \hat{B}(\hat{U}) - 2\sqrt{\hat{V} },\ \ \hat{U} - \hat{B}(\hat{U}) + 2\sqrt{\hat{V} } \right).$$

Since $\hat{U} = 0$ corresponds to no adjustment of the census, one comparison of the relative accuracy of the census and the A.C.E. Revision II estimates is based on an assessment of whether the confidence intervals for the evaluation post-strata cover 0 and $\hat{U}$.

A loss function analysis for levels and shares compares the census and the A.C.E. Revision II estimator for counties and places across the nation and within state. The measure of accuracy used by the loss functions is the weighted mean squared error with the weights set to the reciprocal of the census count for levels and the reciprocal of census share for shares. The motivation for the selected groupings for the loss functions is their potential use in the post-censal estimates. These groupings are:
- Levels
  - All counties with population of 100,000 or less
  - All counties with population greater than 100,000
  - All places with population at least 25,000 but less than 50,000
  - All places with population at least 50,000 but less than 100,000
  - All places with population greater than 100,000
- Shares within state
  - All counties
  - All places
- Shares within U.S.
  - All places with population at least 25,000 but less than 50,000
  - All places with population at least 50,000 but less than 100,000
  - All places with population greater than 100,000
  - All states
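The bias-corrected confidence interval and the census-count-weighted loss described in this section can be sketched in a few lines of Python. This is a hedged illustration, not the Census Bureau's implementation: the function names, the numerical inputs, and the `targets` values (stand-ins for true populations, which are unknown in practice) are all assumptions made for the example, and the estimation of bias and variance that the actual analysis requires is omitted.

```python
import math


def net_undercount_ci(u_hat, bias_hat, var_hat):
    """Interval (U - B - 2*sqrt(V), U - B + 2*sqrt(V)) for the net undercount rate."""
    center = u_hat - bias_hat
    half_width = 2.0 * math.sqrt(var_hat)
    return (center - half_width, center + half_width)


def covers(interval, value):
    """True when value lies inside the closed interval."""
    low, high = interval
    return low <= value <= high


def weighted_squared_error_loss(estimates, targets, census_counts):
    """Sum over areas of squared error weighted by the reciprocal of the census count."""
    return sum(
        (est - tgt) ** 2 / cen
        for est, tgt, cen in zip(estimates, targets, census_counts)
    )


# Hypothetical post-stratum: estimated undercount 1.0%, bias 0.2%, variance 4e-6.
interval = net_undercount_ci(0.010, 0.002, 4e-6)
print("interval:", interval)
print("covers 0 (no adjustment):", covers(interval, 0.0))
print("covers the estimate itself:", covers(interval, 0.010))

# Hypothetical levels comparison for two areas.
census = [1000.0, 2000.0]
ace = [1015.0, 1995.0]
targets = [1020.0, 1990.0]  # made-up stand-ins for the unknown true populations
census_loss = weighted_squared_error_loss(census, targets, census)
ace_loss = weighted_squared_error_loss(ace, targets, census)
print("census loss:", census_loss, "A.C.E. loss:", ace_loss)
```

In this made-up case the interval excludes 0 but contains the estimate, and the adjusted figures have the smaller loss; with other inputs either comparison could go the other way.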
Section II. References

Adams, T. and Krejsa, E. (2001). ESCAP II: Results of the Person Followup and Evaluation Followup Forms Review, Executive Steering Committee for A.C.E. Policy II, Report 24.
Adams, T. and Krejsa, E. (2002). A.C.E. Revision II Measurement Subgroup Documentation, DSSD A.C.E. Revision II Memorandum Series #PP-6.
Adams, T. and Liu, X. (2001). ESCAP II: Evaluation of Lack of Balance and Geographic Errors Affecting Person Estimates, Executive Steering Committee for A.C.E. Policy II, Report 2.
Barrett, D., Beaghen, M., Smith, D., and Burcham, J. (2001). ESCAP II: Census 2000 Housing Unit Coverage Study, Executive Steering Committee for A.C.E. Policy II, Report 17.
Bean, S. (2001). ESCAP II: Accuracy and Coverage Evaluation Matching Error, Executive Steering Committee for A.C.E. Policy II, Report 7.
Cantwell, P. and Childers, D. (2001). Accuracy and Coverage Evaluation Survey: A Change to the Imputation Cells to Address Unresolved Resident and Enumeration Status, DSSD Census 2000 Procedures and Operations Memorandum Series, #Q-44.
Childers, D. (2001). Accuracy and Coverage Evaluation: The Design Document, DSSD Census 2000 Procedures and Operations Memorandum Series, Chapter S-DT-1, Revised.
Davis, P. (2001). Accuracy and Coverage Evaluation: Dual System Estimation Results, DSSD Census 2000 Procedures and Operations Memorandum Series #B-9*.
ESCAP I (2001). Report of the Executive Steering Committee for Accuracy and Coverage Evaluation Policy, March 1, 2001. (See www.census.gov/dmd/www/pdf/Escap2.pdf)
ESCAP II (2001). Report of the Executive Steering Committee for Accuracy and Coverage Evaluation Policy on Adjustment for Non-Redistricting Uses, October 17, 2001. (See www.census.gov/dmd/www/pdf/Recommend2.pdf)
Fay, R. (2001). ESCAP II: Evidence of Additional Erroneous Enumerations from the Person Duplication Study, Executive Steering Committee for A.C.E. Policy II, Report 9, Preliminary Version, October 26, 2001.
Fay, R. (2002). ESCAP II: Evidence of Additional Erroneous Enumerations from the Person Duplication Study, Executive Steering Committee for A.C.E. Policy II, Report 9, Revised Version, March 27, 2002.
Fay, R. (2002b). Probabilistic Models for Detecting Census Person Duplication, Proceedings of the Survey Research Methods Section, American Statistical Association.
Feldpausch, R. (2001). ESCAP II: Census Person Duplication and the Corresponding A.C.E. Enumeration Status, Executive Steering Committee for A.C.E. Policy II, Report 6.
Griffin, R. and Malec, D. (2001). Accuracy and Coverage Evaluation: Assessment of Synthetic Assumption, DSSD Census 2000 Procedures and Operations Memorandum Series, B-14*.
Griffin, R. and Malec, D. (2001b). ESCAP II: Sensitivity Analysis for the Assessment of the Synthetic Assumption, Executive Steering Committee for A.C.E. Policy II, Report 23.
Haberman, S. and Spencer, B. (2001). Estimation of Inconsistent Post-stratification in the 2000 A.C.E. Paper prepared by Abt Associates Inc. and Spencer Statistics, Inc. under Task Number 46-YABC-7-00001, Contract Number 50-YABC-7-66020.
Haines, D. (2001). Accuracy and Coverage Evaluation Survey: Computer Specifications for Person Dual System Estimation (U.S.) - Re-issue of Q-37, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-48.
Hogan, H. (1993). The 1990 Post-Enumeration Survey: Operations and Results, Journal of the American Statistical Association, 88, 1047-1060.
Hogan, H. (2002). Five Challenges in Preparing Improved Post Censal Population Estimates, DSSD A.C.E. Revision II Memorandum Series #PP-1.
Hogan, H., Kostanich, D., Whitford, D., and Singh, R. (2002). Research Findings of the Accuracy and Coverage Evaluation and Census 2000 Accuracy, Proceedings of the Survey Research Methods Section, American Statistical Association.
Ikeda, M. (2001). Accuracy and Coverage Evaluation Survey: Some Notes Related to Accuracy and Coverage Evaluation Missing Data Procedures, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-77.
Ikeda, M. and McGrath, D. (2001). Accuracy and Coverage Evaluation Survey: Specifications for the Missing Data Procedures; Revision of Q-25, DSSD Census 2000 Procedures and Operations Memorandum Series #Q-62.
Judson, D. (2000). The Statistical Administrative Records System: System Design and Challenges, Paper presented at the NISS/Telcordia Data Quality Conference, November, 2000.
Keathley, D., Kearney, A., and Bell, W. (2001). ESCAP II: Analysis of Missing Data Alternatives for the Accuracy and Coverage Evaluation, Executive Steering Committee for A.C.E. Policy II, Report 12.
Kostanich, D. (2003). A.C.E. Revision II: Summary of Methodology, DSSD A.C.E. Revision II Memorandum Series #PP-35.
Krejsa, E. and Adams, T. (2002). Results of the A.C.E. Revision II Measurement Coding, DSSD A.C.E. Revision II Memorandum Series #PP-55.
Krejsa, E. and Raglin, D. (2001). ESCAP II: Evaluation Results for Changes in A.C.E. Enumeration Status, Executive Steering Committee for A.C.E. Policy II, Report 3.
Leggieri, C., Pistiner, A., and Farber, J. (2002). Methods for Conducting an Administrative Records Experiment in Census 2000, Proceedings of the Survey Research Methods Section, American Statistical Association.
Mule, T. (2001). ESCAP II: Person Duplication in Census 2000, Executive Steering Committee for A.C.E. Policy II, Report 20.
Mule, T. (2001b). Accuracy and Coverage Evaluation: Decomposition of Dual System Estimate Components, DSSD Census 2000 Procedures and Operations Memorandum Series #B-8*.
Mule, T. (2002). Revised Preliminary Estimates of Net Undercounts for Seven Race/Ethnicity Groupings, DSSD A.C.E. Revision II Memorandum Series #PP-2.
Mule, T. (2002b). Further Study of Person Duplication Statistical Matching and Modeling Methodology, DSSD A.C.E. Revision II Memorandum Series #PP-51.
Mulry, M. and Petroni, R. (2002). Error Profile for PES-C as Implemented in the 2000 A.C.E., Proceedings of the Survey Research Methods Section, American Statistical Association.
Nash, F. (2000). Overview of the Duplicate Housing Unit Operations, Census 2000 Information Memorandum Number 78.
Raglin, D. and Krejsa, E. (2001). ESCAP II: Evaluation Results for Changes in Mover and Residence Status in the A.C.E., Executive Steering Committee for A.C.E. Policy II, Report 16.
Robinson, J. G. (2001). ESCAP II: Demographic Analysis Results, Executive Steering Committee for A.C.E. Policy II, Report 1.
Thompson, J., Waite, P., Fay, R. (2001). Basis of Revised Early Approximation of Undercounts Released October 17, 2001, Executive Steering Committee for A.C.E. Policy II, Report 9a.
U.S. Census Bureau (2003). Technical Assessment of A.C.E. Revision II, March 12, 2003. (See www.census.gov/dmd/www/pdf/ACETechAssess.pdf)
Winkler, W. (1995). Matching and Record Linkage, Business Survey Methods, ed. B. G. Cox et al. (New York: J. Wiley and Sons), 355-384.
Winkler, W. (1999). Documentation for Record Linkage Software, U.S. Census Bureau, Statistical Research Division.
Yancey, W. (2002). BigMatch: A Program for Extracting Probable Matches from a Large File for Record Linkage, U.S. Census Bureau, Statistical Research Division.