Leveraging Open Source Data Integration Tools for the Law Firm

Most firms don’t have the resources or expertise to purchase and implement an enterprise business intelligence system, but they still need tools to manage their firm’s performance. That was the case in our implementation, so we turned to open source tools to consolidate the data into one operational data store for reporting. All the data is merged into one business view that serves as the source for all the firm’s reports and analysis. The reports are scheduled to run daily and weekly and are delivered to the users automatically.


  • Data stored in multiple systems to support the different practice areas
  • Current reports manually created in Excel are prone to errors during multiple copy/paste steps
  • Information not available in a timely manner

The Company has multiple practice areas. Practice areas such as Mass Tort, Medical Malpractice and Commercial were located in the Legal Files system, while Personal Injury and Worker’s Comp were located in the Needles system. The Leads and Intake data were located in a separate database. Other marketing data, such as Google Analytics/AdWords, social website statistics, PPC services and third-party data, needed to be merged with the existing data to get true value for the firm.

The disparate data created challenges for users who needed to create reports. All the data needed to be exported into many tables. There were also many transcription and calculation errors, which reduced confidence in the accuracy of the data when the management reports were created. When errors were found, the entire process of creating the reports had to be repeated, which delayed the availability of the information.

    Success Strategy

    • Hired Assign It To Us to create a data warehouse and business intelligence solution with a limited available budget
    • Merged the data from all systems using the Pentaho open source Data Integration tools
    • Designed and developed the corporate reports and analytics leveraging the existing reporting tools. Final reports are automatically refreshed.

    A solution was needed that could be delivered within the budget constraints. The solution involved deploying an entire business intelligence solution and operational data store using open source software. The Pentaho Data Integration open source tool is used to merge all the data from the disparate data sources into one central ODS (operational data store). Business rules were incorporated into the ODS to satisfy the firm’s reporting requirements.
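The consolidation idea can be illustrated outside Pentaho. The sketch below is not the firm's actual transformation: it uses Python's built-in sqlite3 module as a stand-in ODS, and the table name `ods_matter`, the column names, the matter codes and the sample rows are all hypothetical. It shows rows from two source systems being mapped onto one shared layout, with one assumed business rule applied and a source tag kept so each record stays traceable.

```python
import sqlite3

# Hypothetical extracts; real column names in Legal Files and Needles will differ.
LEGAL_FILES_ROWS = [
    {"file_no": "LF-100", "area": "Mass Tort", "opened": "2015-01-12"},
    {"file_no": "LF-101", "area": "Commercial", "opened": "2015-02-03"},
]
NEEDLES_ROWS = [
    {"casenum": 5001, "matcode": "PI", "date_opened": "2015-01-20"},
    {"casenum": 5002, "matcode": "WC", "date_opened": "2015-03-08"},
]

# Assumed business rule: expand short matter codes to practice-area names.
NEEDLES_AREAS = {"PI": "Personal Injury", "WC": "Worker's Comp"}


def build_ods(conn):
    """Load both extracts into one table with a common layout."""
    conn.execute(
        "CREATE TABLE ods_matter ("
        "source_system TEXT, matter_id TEXT, practice_area TEXT, opened_date TEXT)"
    )
    rows = [
        ("Legal Files", r["file_no"], r["area"], r["opened"])
        for r in LEGAL_FILES_ROWS
    ] + [
        ("Needles", str(r["casenum"]), NEEDLES_AREAS[r["matcode"]], r["date_opened"])
        for r in NEEDLES_ROWS
    ]
    conn.executemany("INSERT INTO ods_matter VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def matters_by_area(conn):
    """Return {practice_area: count} across all source systems."""
    cur = conn.execute(
        "SELECT practice_area, COUNT(*) FROM ods_matter GROUP BY practice_area"
    )
    return dict(cur.fetchall())


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_ods(conn)
    print(matters_by_area(conn))
```

Once every system lands in the same layout, a report only has to query one table instead of one export per system.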

    After the data warehouse is refreshed each night, the reports and analytics are refreshed automatically, which eliminated the weekly report development time.
    Many reports are automatically generated and emailed to users.

    Key Management Reports

    1. New Matters Report – Track matters by intake source and practice area.

    Tracking the source of a new matter for each practice area gives the management team info on which methods are generating leads and matters for the firm. The number of matters is tracked by source grouping, lead/referral type, and practice area. A comparison over time (in this case, by calendar year) lets the managing partner know if new matters are growing or declining. Data from this report helps the manager make better decisions on marketing and acquisition spend.
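As a rough illustration of the grouping and year comparison behind this report, the sketch below counts matters per intake source and practice area and compares two calendar years. The sample rows, source names and function names are hypothetical, and it is written in plain Python rather than in the reporting tools used for the firm.

```python
from collections import Counter

# Hypothetical rows: (calendar year, intake source, practice area).
NEW_MATTERS = [
    (2014, "Referral", "Personal Injury"),
    (2014, "PPC", "Mass Tort"),
    (2015, "Referral", "Personal Injury"),
    (2015, "Referral", "Personal Injury"),
    (2015, "PPC", "Mass Tort"),
    (2015, "TV", "Mass Tort"),
]


def count_new_matters(rows):
    """Count matters per (year, source, practice area)."""
    return Counter(rows)


def year_over_year(rows, prior, current):
    """Return {(source, area): (prior count, current count, change)}."""
    counts = count_new_matters(rows)
    keys = sorted({(s, a) for (y, s, a) in counts if y in (prior, current)})
    result = {}
    for source, area in keys:
        before = counts[(prior, source, area)]  # a Counter returns 0 if absent
        after = counts[(current, source, area)]
        result[(source, area)] = (before, after, after - before)
    return result


if __name__ == "__main__":
    for key, (before, after, change) in year_over_year(NEW_MATTERS, 2014, 2015).items():
        print(key, before, after, f"{change:+d}")
```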


    2. Web Analytics – Track your PPC campaigns.

    Investing in a Pay-Per-Click marketing campaign can sometimes be a “black box”. You know you should be running web campaigns but costs can skyrocket if not tracked properly. The Web Analytics report helps identify the PPC campaigns that are effective and those that need adjustment. Components of the report include…

    • Weekly leads and conversions (new clients) by practice area
    • Traffic by specific blog or website domain
    • Other metrics showing web presence, including Google indexed pages, SEO stats & rankings, Facebook likes count, new Twitter followers, # of videos & views on YouTube and new blog articles added

    In this case, the data was pulled automatically from the cloud data sources using API calls and stored in a central data store using the Pentaho Data Integration tool.


    3. Practice Area Case Listing Report

    Collect and analyze the data about the life cycle of cases by practice area in a list format. Include information such as:

    • Open file number and case description
    • Responsible and supporting attorneys
    • Paralegals
    • Trial start and end dates
    • Court & judge
    • Fee agreements and dates received
    • Case profitability

    4. Statute of Limitation (SOL) Reports

    Info that defines the SOL dates is recorded during the intake process. The report identifies the cases that have SOL dates and examines the dates to ensure that legal proceedings are initiated before the statute of limitation date has expired. For our use, we created two sets of reports based on retained and non-retained cases. Exception highlighting is added to each case, coloring the row so that the reader is alerted to the status of the SOL case. Here are the groupings:

    • Upcoming cases with SOL dates within the next 30 days – highlighted in Red
    • Upcoming cases with SOL dates between 30 & 60 days from now – highlighted in Yellow
    • Upcoming cases with SOL dates between 60 & 90 days from now – highlighted in Blue
    • Cases past the SOL date – highlighted in Grey
    • All other cases – not highlighted
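The groupings above amount to a small classification rule. Here is a minimal sketch of it in Python; the function name is hypothetical, and because the list does not say which band a case falls in at exactly 30 or 60 days, the sketch assumes each boundary day belongs to the more urgent color.

```python
from datetime import date, timedelta


def sol_highlight(sol_date, today):
    """Return the highlight color for a case, or None for no highlight.

    Assumption: boundary days (exactly 30, 60 or 90 days out) fall into
    the more urgent band.
    """
    days_left = (sol_date - today).days
    if days_left < 0:
        return "Grey"    # SOL date has already passed
    if days_left <= 30:
        return "Red"
    if days_left <= 60:
        return "Yellow"
    if days_left <= 90:
        return "Blue"
    return None          # more than 90 days away: not highlighted


if __name__ == "__main__":
    today = date.today()
    for offset in (-5, 10, 45, 75, 200):
        print(offset, sol_highlight(today + timedelta(days=offset), today))
```

Passing `today` in explicitly, rather than reading the clock inside the function, keeps the rule easy to check against known dates.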

      5. Mass Tort Campaign Reports

      This report shows the total Leads/Cases acquired for a Mass Tort campaign. In this illustration, the mass tort claims are against a drug currently on the market (redacted). The total cases are grouped by source (rows), and the info relevant to converting the leads to cases is listed in the columns.


      6. Conflicts Check Report

      In our firm, paralegals used to run a separate conflict check for each system and then had to merge all the outputs into one report. This was time-consuming, and names were sometimes missed during the merge. When these errors were made, the paralegals had to repeat the whole labor-intensive process.

      To solve this problem, we merged the data from the 4 systems: Legal Files, Elite, Needles, and Tabs3. We consolidated the names/party and file data into one dataset using the Pentaho ETL tool. This allows the user to run the Conflicts Check report once and search for a specified name against all systems. Based on the search, the report lists all the related names/parties, open or closed matters, and all the file-related persons and their roles.
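To show why one consolidated dataset makes the search simpler, the sketch below runs a single name search over party rows drawn from all four systems. The rows, field order and function names are hypothetical, and the matching rule (case-insensitive substring match after collapsing whitespace) is an assumption for illustration, not the matching logic of the firm's report.

```python
# Hypothetical consolidated party rows:
# (source system, file number, matter status, party name, role).
PARTIES = [
    ("Legal Files", "LF-100", "Open", "Jane Smith", "Plaintiff"),
    ("Elite", "E-220", "Closed", "Smith Holdings LLC", "Defendant"),
    ("Needles", "5001", "Open", "John Doe", "Client"),
    ("Tabs3", "T-77", "Open", "Jane  SMITH", "Witness"),
]


def normalize(name):
    """Lowercase the name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def conflicts_check(parties, search_name):
    """Return every party row, from any system, whose name contains the search text."""
    needle = normalize(search_name)
    if not needle:
        return []  # an empty search would otherwise match every row
    return [row for row in parties if needle in normalize(row[3])]


if __name__ == "__main__":
    for row in conflicts_check(PARTIES, "smith"):
        print(row)
```

A production check would likely need fuzzier matching (nicknames, misspellings, corporate suffixes), which this sketch does not attempt.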


      7. Case Profitability Statement

      This report is a detailed itemized listing of costs, expenses, and fees by case.  Ledger information is summarized and compared to show the overall profitability of the case.  A section detailing timekeepers and billable hours is also listed.
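The report's exact calculation is not spelled out here, so the following is only a sketch under a stated assumption: profitability per case is taken as total fees minus total costs and expenses. The ledger rows, category names and function name are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical ledger rows: (case id, category, amount).
LEDGER = [
    ("LF-100", "fee", Decimal("12000.00")),
    ("LF-100", "cost", Decimal("1500.00")),
    ("LF-100", "expense", Decimal("350.25")),
    ("5001", "fee", Decimal("4000.00")),
    ("5001", "cost", Decimal("4800.00")),
]


def profitability_by_case(ledger):
    """Summarize the ledger per case and return {case id: fees - costs - expenses}.

    Assumption: this simple formula stands in for the report's real logic.
    """
    totals = defaultdict(
        lambda: {"fee": Decimal("0"), "cost": Decimal("0"), "expense": Decimal("0")}
    )
    for case_id, category, amount in ledger:
        totals[case_id][category] += amount
    return {
        case_id: t["fee"] - t["cost"] - t["expense"]
        for case_id, t in totals.items()
    }


if __name__ == "__main__":
    for case_id, profit in profitability_by_case(LEDGER).items():
        print(case_id, profit)
```

Decimal is used instead of float so that currency amounts add up without binary rounding error.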

      By centralizing all the data in an operational data store, firms can create powerful reports and analyses like the ones listed above. Managing partners need these tools to better track their firm’s performance and take action to maximize the firm’s profitability.

      Results and Benefits

      • Quick return on investment. Time spent by IT & report developers to create the weekly reports was virtually eliminated.
      • Data presented in a “self-serve” and push format
      • Users making better decisions based on timely information

        The open source ODS solution allows the users to retrieve the information they need, when they need it, in a format that meets their needs, and it hides the complexity of the underlying data. This has reduced the need for IT and report developers to be involved in the process, which is the real cost saving.

        Automation of the ODS eliminated the errors that were introduced when the reports were created manually. The Management team was also able to get reports and analytics that were previously not available due to the difficulty of working with the underlying databases. The functionality of the Pentaho tools and the resulting ODS removed the complexity that was a barrier to creating the reports that management needed.