Psychometric testing in social care

Listening to practitioners leads to effective psychometric testing, says Robyn Johnson

Testing the reliability and effectiveness of an assessment tool is usually the domain of academic institutions.

But in 2015, after evaluating and revising the Graded Care Profile service, which helps social workers measure the quality of care being given to a child, the NSPCC decided to carry out psychometric testing for the first time.

Our experience has shown that practitioner engagement is crucial in making sure the requirements of robust testing are met, alongside good social work practice.

What is the Graded Care Profile?

Graded Care Profile (GCP) is a tool for assessing the care and neglect of children. After the tool was evaluated in 2015 it became known as Graded Care Profile 2. 

To make sure that Graded Care Profile 2 was valid and reliable, we decided to carry out psychometric testing through NSPCC service centres, with practitioners who were qualified social workers. This provided a real-world service setting, with families being assessed and, if necessary, receiving help.

It allowed us to balance the needs of good social work practice with the needs of robust testing.

How to engage practitioners and managers

To achieve the balance between good social work practice and robust testing, we listened to practitioners and their managers.

We carefully explained the requirements of the testing along with the rationale behind it. We invited them to come up with solutions to any issues our testing presented. 

Once they understood the testing process, practitioners were in a unique position to anticipate how the testing would work in their own practice. They could predict potential challenges, as well as possible solutions. 

This process proved invaluable. We used the practitioners' expertise and engagement to create the right environment for the testing, and to make sure that normal practice could go on.

The role of inter-rater reliability testing

To check whether 2 practitioners using the tool would be likely to come up with similar results, we used inter-rater reliability testing: a quantitative measure of consensus between 2 or more observers.

The testing involved pairs of practitioners using Graded Care Profile 2 with the same families, so that they saw and heard the same evidence.

We then compared results to see how similar they were.
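
As an illustration only, the comparison of two practitioners' scores could be sketched as below. This article does not say which statistic was used, so the choice of simple percent agreement and Cohen's kappa is an assumption, and the function names and the grades on a 1 to 5 scale are hypothetical examples, not NSPCC data.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of items on which both raters gave the same grade."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters need the same, non-zero number of scores")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability that both raters pick the same grade
    # if each graded at random according to their own grade frequencies.
    expected = sum(counts_a[g] * counts_b[g] for g in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical grade throughout
    return (observed - expected) / (1 - expected)


# Hypothetical grades from a pair of practitioners for the same 8 items.
practitioner_a = [1, 2, 2, 3, 4, 5, 3, 2]
practitioner_b = [1, 2, 3, 3, 4, 5, 3, 1]

print(f"Percent agreement: {percent_agreement(practitioner_a, practitioner_b):.2f}")
print(f"Cohen's kappa:     {cohens_kappa(practitioner_a, practitioner_b):.2f}")
```

Percent agreement is easy to read but ignores agreement that would happen by chance, which is why a chance-corrected measure such as kappa is often reported with it. For graded scales, a weighted kappa or an intraclass correlation may be more suitable, because they treat near-misses as less serious than large disagreements.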

A new approach inspired by good practice

Usually, in inter-rater reliability testing, raters do not share their thoughts, so that they do not influence each other's scores.

But practitioners normally discuss cases and reflect on family visits - and they wanted to continue this good practice.

We talked to the practitioners, managers and the team carrying out the testing so we all understood the issues. Practitioners suggested an approach in which they would carry out their scoring as soon as the pair agreed they had enough information. After scoring, they would be free to share their thoughts. This worked well for everyone.

Putting families first in research

Understandably, practitioners wanted to use Graded Care Profile 2 with all children in a family where there was concern. And if there were 2 parents, each parenting differently, they wanted to use the tool with each parent.

But for the testing, it was best to use different families so the tool was used with cases that weren't too similar. 

Again, following discussions, we agreed that practitioners would use the tool with as many children in a family as required, submitting all the scores. 

We chose a maximum of 2 cases per family to be used for testing: where possible, 1 child and mother, and another child and father, so they were as different as possible. This also had practical benefits - fewer families needed to be recruited, so the testing process was faster.

Managers also suggested a moderation exercise once a family's scores had been submitted. Both practitioners would agree on a set of joint scores. 

These scores were used with the family and shared with the original referrer to help decide what happened next. In this way, the family received a normal service whilst taking part in the testing. 

What is needed for accurate outcomes

To produce accurate results, we needed the same practitioners to be involved throughout both the reliability testing and the validity testing.

We were flexible with timescales, to allow for sick leave and because visits ideally needed to be carried out with both practitioners present. We were also upfront about the commitment we needed, and the practitioners agreed to stay committed to this work until we had all the data we needed.

By listening to practitioners, we found that we could meet the requirements of robust psychometric testing alongside good social work practice. 
