A data-sharing scheme that supports multi-keyword search for electronic medical records

  • Shufen Niu,

    Roles Conceptualization, Investigation, Methodology, Resources, Software, Supervision, Validation, Writing – original draft, Writing – review & editing

    Affiliation College of Computer Science and Engineering, Northwest Normal University, Lanzhou, Gansu, China

  • Wenke Liu,

    Roles Conceptualization, Investigation, Methodology, Project administration, Resources, Software, Visualization, Writing – original draft, Writing – review & editing

    liuwenke0315@foxmail.com

    Affiliation College of Computer Science and Engineering, Northwest Normal University, Lanzhou, Gansu, China

  • Song Han,

    Roles Software, Writing – original draft

    Affiliation College of Computer Science and Engineering, Northwest Normal University, Lanzhou, Gansu, China

  • Lizhi Fang

    Roles Software, Writing – original draft

    Affiliation College of Computer Science and Engineering, Northwest Normal University, Lanzhou, Gansu, China

Abstract

As cloud storage technology develops, the sharing of cloud-based electronic medical records (EMRs) has become a hot topic in academia and the healthcare sector. To solve the problem of securely searching and sharing EMRs on cloud platforms, we propose an EMR data-sharing scheme that supports multi-keyword search. The scheme combines searchable encryption and proxy re-encryption to perform keyword search over encrypted EMRs and to share them securely. At the same time, the scheme uses a traceable pseudo identity to protect the patient’s private information. Our scheme is proven secure under the modified Bilinear Diffie-Hellman assumption and the Quotient Decisional Bilinear Diffie-Hellman assumption in the random oracle model. The performance of our scheme is evaluated through theoretical analysis and numerical simulation.

1 Introduction

An electronic medical record (EMR) is a digital document that contains medical information about a patient; this document is stored, managed, transmitted, and reproduced with electronic devices (computers, health cards, and others) [1]. Compared with the traditional paper medical record, EMR offers large storage capacity, resource savings, convenient querying, and improved diagnosis and treatment efficiency. With the continuous development of cloud computing, EMR has developed rapidly and is now widely used. A growing number of institutions and individuals use EMR and upload these data to the cloud for storage. Compared with traditional systems, cloud-based systems let users store and maintain massive data quickly and enjoy high-quality data storage services provided by cloud computing [2].

Because the cloud is a pervasive storage platform, service providers are willing to deploy their EMR storage and application services on cloud servers [3]. Since EMR involves a large amount of the patient’s private information, an important task is to prevent the EMR from being leaked to unauthorized users and cloud servers [4, 5]. To ensure data security and user privacy, the data are usually stored in the cloud server as ciphertext, which raises the problem of how to search over the ciphertext. Searchable encryption is a cryptographic primitive developed in recent years that lets users perform keyword search on ciphertext. It uses the abundant computing resources of cloud servers to carry out the search [6, 7]. Using searchable encryption, users can efficiently search EMR on the cloud server [8]. However, because the EMR is encrypted with the patient’s public key, the ciphertext can only be decrypted by the patient using the private key, which makes EMR sharing inconvenient. Re-encryption technology converts a ciphertext [9] into one that can be decrypted by another user, so that the patient’s EMR can be shared.

1.1 Related works

To enable precise retrieval of encrypted data, Song et al. [10] first proposed symmetric searchable encryption (SSE) based on a stream cipher. However, key distribution in SSE is difficult, which means that it cannot be applied in many practical settings. To address this issue, Boneh et al. [11] proposed public key encryption with keyword search (PEKS) and proved its security under the random oracle model. However, a secure channel is necessary to transmit the key in this scenario. Baek et al. [12] first proposed a secure channel free public key encryption with keyword search (SCF-PEKS) model. Xu et al. [13] proposed the concept of public key encryption with fuzzy keyword search. After the server performs a fuzzy keyword search on all ciphertexts, it returns the results to the receiver, and the receiver performs a more accurate keyword search on these results. As searchable encryption provides the capability to query encrypted data with a given keyword, it can be applied to the EMR to protect the patient’s private information, such as information on identity, communication, and medical history. Liu et al. [14] proposed an efficient and secure fine-grained access control scheme, which gives authorized users access to the EMR in cloud storage. Li et al. [15] proposed an attribute-based searchable encryption for the EMR system, which reduced the difficulty of key management in a multi-user environment and realized fine-grained access control of EMR by data owners. As certificateless public key cryptography solves the key escrow problem and avoids the use of certificates, Ma et al. [16] proposed a certificateless searchable public key encryption scheme for mobile healthcare systems. In most EMR data-sharing schemes based on searchable encryption, the EMR is encrypted with the patient’s public key, so only the patient can decrypt it with his/her private key. If the patient’s condition is serious and several hospitals are needed for an online consultation, sharing the EMR becomes a problem.

Proxy re-encryption (PRE) is considered a good solution to the aforementioned problem. In proxy re-encryption, a semi-trusted proxy uses a re-encryption key to convert a ciphertext encrypted with the public key of the delegator Alice into a ciphertext encrypted with the public key of the delegatee Bob. Shao et al. [17] first proposed a new cryptographic primitive called proxy re-encryption with keyword search (PRES) and constructed a bidirectional PRES scheme; they proved the security of this scheme under the random oracle model. Guo et al. [18] proposed the definition and security model of proxy re-encryption with keyword search with a designated tester (dPRES), which can be proved secure under the standard model. Chen et al. [19] proposed the model of limited proxy re-encryption with keyword search (LPREKS) and proved its security under the mBDH assumption and q-DBDHI assumption in the random oracle model.

1.2 Our contributions

In this study, we propose an electronic medical record data-sharing scheme that supports multi-keyword search. We simplify the proposal of Chen et al. [19] and apply it to the data sharing of EMR to achieve secure storage, privacy preservation, and secure sharing of EMR. Roughly, the contributions of our scheme are described as follows:

  1. We propose a framework for cloud-based EMR sharing with security and privacy preservation to improve diagnosis in an e-Health system. The doctor generates the EMR for the patient and encrypts it using the public key of the patient. The cloud server is responsible for storing the patient’s EMR ciphertext and performs the search operation on EMR.
  2. Our scheme achieves conditional privacy preservation: each EMR encrypted for a patient is mapped to a distinct pseudo identity, while a legitimate hospital can retrieve the real identity of a patient from any pseudo identity. When the true identity of the patient needs to be obtained, the user can send an identity-tracking request to the hospital. After verifying that the request is legitimate, the hospital returns the true identity of the patient to the user.
  3. We apply searchable encryption to implement secure search on the patient’s EMR. The keyword index is stored in the cloud server. When the patient or data user needs to access the patient’s EMR, the patient uses his/her private key and the query keywords to generate a trapdoor and uploads it to the cloud server; the cloud server then performs the search operation.
  4. In this scheme, EMR can be obtained not only by the patient, but also by a data user, such as a medical institution or an insurance company. We apply proxy re-encryption to ensure secure sharing of the patient’s EMR. When the patient wants to access his/her EMR, the patient sends the trapdoor to the cloud server. The cloud server returns the EMR ciphertext to the patient. When the data user wants to obtain the patient’s EMR, an authorization request is sent to the patient. After the patient authorizes the request, the cloud server generates a re-encryption key to re-encrypt the EMR ciphertext. After obtaining the re-encryption ciphertext, the data user decrypts it with his/her private key to obtain the patient’s EMR.

1.3 Paper organization

The rest of this paper is organized as follows. In section 2, we present some preliminaries. In section 3, we introduce the system architecture, the threat model and design goals, and the algorithm model of our scheme. In section 4, we provide an overview of our scheme and describe the scheme in detail. Section 5 provides the security analysis, including the achieved goals and the security proof of our scheme. In section 6, we compare the proposed scheme with relevant schemes through theoretical analysis and numerical simulation. Finally, we conclude the paper in section 7.

2 Preliminaries

2.1 Bilinear map

Let G1 and G2 be two cyclic groups of large prime order q. Let g be a generator of G1. A bilinear pairing is a function e: G1 × G1 → G2 that satisfies the following properties:

  1. Bilinearity: For any a, b ∈ Z_q^*, e(g^a, g^b) = e(g, g)^{ab}.
  2. Non-degeneracy: e(g, g) ≠ 1.
  3. Computability: e(u, v) can be efficiently computed for any u, v ∈ G1.
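
The first two properties can be checked numerically in a toy model. The sketch below is not a real pairing and offers no security: as an illustrative assumption, it represents each element g^x of G1 (and each element e(g, g)^y of G2) by its exponent modulo a small prime q, so that pairing two elements becomes multiplying their exponents. The names `pair` and `rand_exp` and the value of `q` are hypothetical and chosen only for this example.

```python
# Toy "exponent model" of a symmetric pairing (an assumption for illustration,
# NOT a real or secure pairing): g^x in G1 and e(g, g)^y in G2 are both
# represented by their exponents modulo the group order q.
import secrets

q = 2**61 - 1  # a prime used as the toy group order (hypothetical choice)


def pair(x: int, y: int) -> int:
    """e(g^x, g^y) = e(g, g)^(x*y), stored as the exponent x*y mod q."""
    return (x * y) % q


def rand_exp() -> int:
    """Uniform element of Z_q^* = {1, ..., q-1}."""
    return secrets.randbelow(q - 1) + 1


a, b = rand_exp(), rand_exp()
g = 1  # the generator g = g^1 has exponent 1

# Bilinearity: e(g^a, g^b) equals e(g, g) raised to the power ab.
bilinear_ok = pair(a * g % q, b * g % q) == (a * b * pair(g, g)) % q

# Non-degeneracy: e(g, g) is not the identity of G2 (exponent 0).
nondegenerate_ok = pair(g, g) != 0
```

In a real deployment the pairing would come from a pairing library over an elliptic curve; the model above only makes the algebra concrete.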

2.2 Hardness assumptions

Let G1 be a cyclic group of large prime order q with a generator g. The following assumptions hold in our scheme.

Definition 1. (Modified Bilinear Diffie-Hellman (mBDH) Problem) [20]. Given (g, g^a, g^b, g^c) ∈ G1^4 for a, b, c ∈ Z_q^*, the mBDH problem is to compute e(g, g)^{ab/c}.

mBDH assumption. We say the mBDH assumption holds if no probabilistic polynomial-time algorithm can solve the mBDH problem with a non-negligible advantage.

Definition 2. (Quotient Decisional Bilinear Diffie-Hellman (QDBDH) Problem) [21]. Given (g, g^a, g^b) ∈ G1^3 and Q ∈ G2 for a, b ∈ Z_q^*, the QDBDH problem is to determine whether Q = e(g, g)^{a/b} or not.

QDBDH assumption. We say the QDBDH assumption holds if no probabilistic polynomial-time algorithm can solve the QDBDH problem with a non-negligible advantage.

3 System model

In this section, we present an architecture for the EMR system. Moreover, we consider several threats and propose several design goals.

3.1 System architecture

As shown in Fig 1, five entities are involved in this system: the patient, the doctor, the hospital, the cloud server, and the data user.

Patient. A patient is an entity who needs medical assistance. The patient first needs to register at the hospital to obtain his/her visiting token. When a patient visits a doctor for treatment, his/her health information is generated by the doctor. When the patient’s EMR is needed, he/she can access the EMR by sending the trapdoor to the cloud server. In addition, the patient calculates a pseudo-identity for himself/herself and sends it to the hospital.

Doctor. The doctor is an entity responsible for generating the EMR for the patient and uploading it to the hospital. The doctor is also responsible for encrypting the EMR with the patient’s public key and sending the ciphertext to the cloud server. When the doctor wants to obtain the patient’s historical EMR, the doctor sends a request to the patient. After receiving the EMR from the cloud server, the patient shows the EMR to the doctor.

Hospital. A hospital is an entity that is responsible for generating a visiting token with value τ for the patient and sending the token to the cloud server. The hospital is also responsible for calculating the true identity of the patient required by the data user.

Cloud server. The cloud server is an entity responsible for storing the patient’s EMR ciphertext and providing the EMR search function. After receiving the trapdoor from the patient, the cloud server performs the search operation on EMR. The cloud server generates the re-encryption key by interacting with data users and patients. Then, the cloud server re-encrypts the EMR ciphertext using the re-encryption key and sends the re-encryption ciphertext to the data user.

Data user. In our scheme, the data user refers to the user authorized by the patient who wants to use the patient’s EMR. For example, if a patient’s condition is complicated, multiple experts are needed for consultation, and the experts come from different hospitals. After interacting with the patient and the cloud server, the data user receives the re-encryption ciphertext sent by the cloud server. The data user can decrypt it using his/her private key.

3.2 Threat model and design goals

In this study, we consider a semi-trusted (honest-but-curious) server, a model that has been widely used in existing work. Specifically, the server honestly performs searches on behalf of patients, but is curious to learn the content of the stored EMR. In addition, malicious outside attackers may intercept and analyze the information transferred over the public channel. Based on the preceding system architecture and threat model, the design goals of our scheme are as follows:

  1. Data confidentiality and integrity. Whether the EMR is stored on the hospital server or transmitted through the public channel, no unauthorized entity can retrieve or modify the EMR data.
  2. Access control. The EMR data belongs to the patients, who control data access. In other words, only authorized users have the right to access the data. Moreover, data access activities should always be carried out with the participation and monitoring of patients and hospitals.
  3. Secure search. When the doctor wants to access the patient’s historical EMR to improve diagnosis, the patient generates a trapdoor to search the EMR. During the process, only patients can generate the trapdoor. Moreover, the pseudo identity of the patient is used in the search process, so an eavesdropper cannot deduce the real identity of the patient.
  4. Privacy preservation. As the EMR data contains privacy-sensitive information of the patient, the patient’s identity must be kept secret.

3.3 Algorithm description

The proposed scheme is composed of nine polynomial-time algorithms:

Setup(1^λ) → PP: The algorithm takes a security parameter 1^λ as input, and outputs the public parameters PP.

KeyGen(PP) → (pk, sk): Given the public parameters PP, the algorithm outputs a public/private key pair (pk, sk).

Enc(pkP, W, M) → C: The algorithm takes a public key of user P, an electronic medical record M, and a keyword set W = (w1, ⋯, wn) as input, and outputs an original ciphertext C.

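Trapdoor(skP, Q) → Tw: The algorithm takes a private key of user P and a query keyword set Q as input, and outputs a keyword trapdoor set Tw.

Search(Tw, C) → {0, 1}: Given a trapdoor set Tw and a ciphertext C, the algorithm outputs 1 if the keywords behind the trapdoors match the keywords encrypted in C for 1 ≤ i ≤ n, or 0 otherwise.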

Dec1(skP, C) → M: The algorithm takes a private key of user P and an original ciphertext C, and outputs a record M if each input parameter is correct.

ReKeyGen(skP, skR) → rkPR: Given user P’s private key skP and user R’s private key skR, the algorithm outputs a re-encryption key rkPR. This process is performed jointly by user P, user R, and the cloud server.

ReEnc(rkPR, C) → C′: The algorithm takes a re-encryption key rkPR from user P to user R and an original ciphertext C for user P, and converts the ciphertext C to C′ for user R.

Dec2(skR, C′) → M: The algorithm takes a private key of user R and a re-encryption ciphertext C′, and outputs a record M if each input parameter is correct.

4 EMR sharing

4.1 Overview of scheme

Without loss of generality, we assume that a patient P registers at a hospital for medical assistance, and the hospital generates a visiting token τ for the patient and sends it to the patient. Here, τ works as the authorization for the doctor to generate EMR for the patient P. Meanwhile, the patient P computes a pseudo identity IDP for himself/herself and returns it to the hospital. The hospital packs the tuple (IDP, τ) and sends it to the cloud server. When the patient P physically visits the doctor, he/she provides τ to the doctor as authorization for generating his/her EMR. We assume that the doctor generates the health record M for the patient P during this interaction. To store the data safely and keep it searchable, the doctor extracts a keyword set W = (w1, ⋯, wn) for the EMR. Then, the doctor encrypts M and W with the patient’s public key pkP. The ciphertext C = (CM, CW) is stored in the cloud server, where CM is the ciphertext of the EMR M and CW is the ciphertext of the keyword set W.

When the patient P visits another doctor in a different hospital, the doctor may think it is necessary to know the patient’s history health record. The patient P can send an access request that includes keyword trapdoor to the cloud server. If the access request is valid, the cloud server sends the patient P the ciphertext CM. The patient P can decrypt CM with his/her private key to obtain the health record M. Then, the patient shows it to the doctor.

If the data user R wants to access the EMR of patient P, then he/she sends an interactive request to the patient and the cloud server. After the interaction, the cloud server generates a re-encryption key. The cloud server uses this key to re-encrypt the EMR ciphertext and obtains the re-encryption ciphertext. Then, the cloud server sends it to the data user R. The data user R uses his/her own private key to decrypt the re-encryption ciphertext. If the data user wants to obtain the true identity of the patient P, he/she can send a request to the hospital.

4.2 Our scheme

In this section, we introduce the details of our proposed scheme. Each entity in our scheme runs at least one of the algorithms listed in “Algorithm description”. Roughly, our proposed scheme is composed of four main phases: initialization, data encryption and storage, search, and record retrieval.

Phase 1: Initialization.

In this phase, the system generates the public parameter PP by running the algorithm Setup(1^λ), where λ is the security parameter. All the patients P, doctors D, and data users R generate their private and public keys by running the algorithm KeyGen(PP).

  1. Setup(1^λ): Select two bilinear groups (G1, G2) of prime order q and a bilinear map e: G1 × G1 → G2. Pick g as a generator of G1 and set Z = e(g, g). Select four hash functions H1: G1 → {0, 1}*, H2: {0, 1}* → G1, H3: G2 → {0, 1}^{log2 q}, H4: G2 → {0, 1}*. Thus, the public parameter can be denoted as PP = {G1, G2, g, q, e, Z, H1, H2, H3, H4}.
  2. KeyGen(PP): Each patient P randomly selects a secret value p ∈ Z_q^* as its private key skP and computes the public key pkP = g^p. Each doctor D randomly chooses a secret value d ∈ Z_q^* as its private key skD and computes the public key pkD = g^d. Each data user R randomly selects a secret value r ∈ Z_q^* as its private key skR and computes the public key pkR = g^r.

When the patient P registers at the hospital, the hospital randomly selects β ∈ {0, 1}* and computes τ = g^{1/β}. Then, the hospital sends the token τ to the patient P securely. Meanwhile, the patient randomly selects s ∈ Z_q^* and computes S = g^s. Thereafter, the patient calculates his/her pseudo identity IDP = RIDP ⊕ H1(τ^s), where RIDP is the real identity of the patient P. The patient P returns the tuple (τ, S, IDP) to the hospital. The hospital chooses a doctor D for the patient and sends the tuple to the cloud server with the doctor.
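
The registration step and the later identity tracing can be illustrated with a small runnable sketch. It is a toy under stated assumptions, not the implementation evaluated in this paper: the group is a tiny prime-order subgroup of the integers modulo a safe prime (far too small to be secure), β is treated as a nonzero exponent modulo q, the mask is combined with the identity by XOR, and H1 is replaced by a stand-in built on SHAKE-256. The identity string and all function names are hypothetical.

```python
import hashlib
import secrets

# Toy parameters (assumptions for illustration only; far too small to be secure):
# p = 2q + 1 is a safe prime and g = 4 generates the subgroup of prime order q.
p, q, g = 2039, 1019, 4


def rand_exp() -> int:
    """Uniform nonzero exponent modulo q."""
    return secrets.randbelow(q - 1) + 1


def h1(elem: int, n: int) -> bytes:
    """Stand-in for H1: hash a group element to an n-byte mask."""
    return hashlib.shake_256(str(elem).encode()).digest(n)


def xor(x: bytes, y: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(x, y))


# Hospital: picks beta and issues the visiting token tau = g^(1/beta).
beta = rand_exp()
tau = pow(g, pow(beta, -1, q), p)

# Patient: picks s, computes S = g^s, and masks the real identity with H1(tau^s).
rid = b"patient-0001"  # hypothetical real identity
s = rand_exp()
S = pow(g, s, p)
pseudo_id = xor(rid, h1(pow(tau, s, p), len(rid)))

# Hospital (tracing): S^(1/beta) = g^(s/beta) = tau^s, so the same mask cancels.
recovered = xor(pseudo_id, h1(pow(S, pow(beta, -1, q), p), len(rid)))
```

Only the party that knows β can turn S into τ^s, which is why tracing is restricted to the hospital in this sketch.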

Phase 2: Data encryption and storage.

When a patient P sees a doctor D for medical assistance, he/she shows the doctor the token τ, which serves as proof that the patient authorizes the doctor to generate his/her EMR. After interacting with the patient P, the doctor D generates the health record M ∈ G2 and extracts a keyword set W = (w1, ⋯, wn) from the record. Then, the doctor stores M in the hospital and encrypts M and W with the patient’s public key pkP by running the algorithm Enc(pkP, W, M).

  1. Enc(pkP, W, M): The doctor randomly selects a value k ∈ Z_q^* and computes C1 = M · Z^k, C2 = […], C3 = H4(C1), and […] for 1 ≤ i ≤ n.

The output of encryption algorithm is C = (CM, CW), where CM = (C1, C2) and CW = (t, H3(t)). Here, CM is the record ciphertext and CW is the keyword index. The doctor sends the ciphertext C and the patient’s pseudo identity IDP to the cloud server. To match the patient’s token in the cloud server, the doctor performs the following operations:

  • Randomly chooses a value […] and computes […], […].

The doctor sends (α, τ′) to the cloud server. Then, the cloud server checks whether the equation H1(τ*) = H1(τ) holds, where τ* = […] is computed from (α, τ′). If the equality holds, the EMR ciphertext C successfully matches the token τ of the patient P. The cloud server stores the ciphertext C and IDP together.

Correctness:

Phase 3: Search.

This phase is divided into two steps: trapdoor generation and test. On another day, the patient may visit another doctor in a different hospital. During the interaction between the doctor and the patient, the doctor may find that it is necessary to access the patient’s historical record for a more accurate diagnosis. To search over the encrypted record C, the patient P needs to compute the trapdoor set Tw for a query keyword set Q by invoking the algorithm Trapdoor(skP, Q).

  1. Trapdoor(skP, Q): The patient P computes the trapdoor set Tw = […].

Meanwhile, the patient P sets an effective access time tr for this request [22], and then sends a tuple (tr, Tw) to the cloud server.

The cloud server checks the validity of tr after receiving the tuple. If tr is not valid, the message is ignored. Otherwise, the cloud server performs Search to check whether the encrypted record C involves the keyword set Q. Precisely, for each wi in Q, the cloud server checks whether the test equation […] holds. If the equality holds, then the cloud server sends the EMR ciphertext CM to the patient P. Otherwise, it sends ⊥.
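
The trapdoor and test equations are not legible in this copy, so the following sketch does not reproduce the authors' construction. It illustrates a generic PEKS-style multi-keyword test under assumed equations: index tags H3(e(g, H2(w_i))^k), a ciphertext component C2 = pkP^k, trapdoors T_i = H2(w_i)^{1/p}, and the test H3(e(C2, T_i)) against the stored tags. It also uses an insecure toy model in which group elements are represented by their exponents modulo a prime q. All function names and keywords are hypothetical.

```python
import hashlib
import secrets

q = 2**61 - 1  # toy prime group order (hypothetical choice)


def rand_exp() -> int:
    return secrets.randbelow(q - 1) + 1


def pair(x: int, y: int) -> int:
    """Toy pairing on exponents: e(g^x, g^y) -> exponent x*y mod q."""
    return (x * y) % q


def h2(word: str) -> int:
    """Stand-in for H2: map a keyword to a nonzero exponent (an element of G1)."""
    digest = hashlib.sha256(word.encode()).digest()
    return int.from_bytes(digest, "big") % (q - 1) + 1


def h3(gt_elem: int) -> bytes:
    """Stand-in for H3: hash an element of G2 to a fixed-length tag."""
    return hashlib.sha256(gt_elem.to_bytes(8, "big")).digest()


def build_index(pk: int, keywords):
    """Assumed index: C2 = pk^k and one tag H3(e(g, H2(w))^k) per keyword."""
    k = rand_exp()
    c2 = (pk * k) % q
    tags = {h3(pair(k, h2(w))) for w in keywords}
    return c2, tags


def trapdoor(sk: int, query):
    """Assumed trapdoor: T = H2(w)^(1/sk) for each query keyword."""
    sk_inv = pow(sk, -1, q)
    return [(h2(w) * sk_inv) % q for w in query]


def search(c2: int, tags, traps) -> bool:
    """Return True only if every trapdoor matches some tag in the index."""
    return all(h3(pair(c2, t)) in tags for t in traps)


sk_p = rand_exp()
pk_p = sk_p  # pk = g^sk is represented by the exponent sk in this toy model
c2, tags = build_index(pk_p, ["diabetes", "insulin", "2020"])
hit = search(c2, tags, trapdoor(sk_p, ["insulin", "diabetes"]))
miss = search(c2, tags, trapdoor(sk_p, ["insulin", "asthma"]))
```

The check succeeds because e(pkP^k, H2(w)^{1/p}) = e(g, H2(w))^k under these assumed equations, so the server learns whether a keyword matches without seeing the keyword itself.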

Correctness:

Phase 4: Record retrieval.

This phase involves two cases: the patient decrypts EMR and the data user decrypts EMR.

Case 1: The patient decrypts EMR.

Upon receiving the EMR ciphertext CM from the cloud server, the patient P decrypts CM to retrieve the record M by invoking the algorithm Dec1(skP, C).

  1. Dec1(skP, C): The patient P calculates M = […].

After obtaining the EMR M, the patient P shows it to the doctor.

Correctness:

Case 2: The data user decrypts EMR.

To obtain the patient P’s EMR, the data user R first requests the patient P and the cloud server to interact with him/her. The cloud server generates the re-encryption key by running the algorithm ReKeyGen(skP, skR). More precisely, the re-encryption key is generated by the following steps:

  • The patient P randomly chooses a value j ∈ Z_q^*. Then, the patient P sends j to the cloud server and skP · j to the data user R.
  • After receiving skP · j from the patient P, the data user R sends skR/(skP · j) to the cloud server.
  • Finally, the cloud server computes the re-encryption key rkPR = j · skR/(skP · j) = skR/skP = r/p.
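
The three messages of this interaction can be traced with plain modular arithmetic. The sketch below assumes that all values are exponents modulo a prime group order q; the value of `q` and the variable names are illustrative choices, not parameters from this paper.

```python
import secrets

q = 2**61 - 1  # toy prime group order (hypothetical choice)


def rand_exp() -> int:
    return secrets.randbelow(q - 1) + 1


def inv(x: int) -> int:
    """Multiplicative inverse modulo q."""
    return pow(x, -1, q)


sk_p, sk_r = rand_exp(), rand_exp()  # private keys p and r

# Step 1: the patient picks a blinding value j, sends j to the cloud server
# and sk_p * j to the data user.
j = rand_exp()
patient_to_server = j
patient_to_user = (sk_p * j) % q

# Step 2: the data user sends sk_r / (sk_p * j) to the cloud server.
user_to_server = (sk_r * inv(patient_to_user)) % q

# Step 3: the cloud server multiplies the two values; the blinding j cancels.
rk = (patient_to_server * user_to_server) % q
```

The blinding value j means the data user sees only skP · j and the cloud server sees only j and skR/(skP · j), so in this sketch neither message alone reveals skP or skR.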

Then, the cloud server re-encrypts the EMR ciphertext CM with rkPR to generate the re-encryption ciphertext for the data user R by running the algorithm ReEnc(rkPR, C).

  1. ReEnc(rkPR, C): The cloud server computes the components of the re-encryption ciphertext C′ from CM and rkPR: […].

The cloud server sets the re-encryption ciphertext C′ and sends it to the data user R. After receiving the EMR re-encryption ciphertext from the cloud server, the data user R decrypts it to retrieve the record M by invoking the algorithm Dec2(skR, C′).

  1. Dec2(skR, C′): The data user R calculates M = […].
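
The decryption and re-encryption equations are not legible in this copy. The sketch below therefore shows one consistent way the pieces could fit together, not the authors' exact formulas. Taken from the text are C1 = M · Z^k and rkPR = r/p. Assumed are C2 = pkP^k, re-encryption as C2′ = C2^{rkPR}, and decryption as M = C1 / e(C2, g)^{1/sk}. The toy model represents group elements by exponents modulo a prime q, so multiplication in G2 appears as addition of exponents; it is insecure and only illustrates the algebra.

```python
import secrets

q = 2**61 - 1  # toy prime group order (hypothetical choice)


def rand_exp() -> int:
    return secrets.randbelow(q - 1) + 1


def pair(x: int, y: int) -> int:
    """Toy pairing on exponents: e(g^x, g^y) -> exponent x*y mod q."""
    return (x * y) % q


sk_p, sk_r = rand_exp(), rand_exp()  # patient key p and data-user key r

# Encryption for the patient (record M = Z^m, represented by the exponent m).
m = rand_exp()
k = rand_exp()
c1 = (m + k) % q     # C1 = M * Z^k       (from the text)
c2 = (sk_p * k) % q  # C2 = pk_P^k = g^(pk) (assumed)


def decrypt(sk: int, c1: int, c2: int) -> int:
    """Assumed decryption: M = C1 / e(C2, g)^(1/sk)."""
    mask = (pair(c2, 1) * pow(sk, -1, q)) % q  # equals k when C2 = g^(sk*k)
    return (c1 - mask) % q


# Re-encryption by the cloud server with rk = r/p (from the text).
rk = (sk_r * pow(sk_p, -1, q)) % q
c2_re = (c2 * rk) % q  # C2' = C2^rk = g^(rk) (assumed)

patient_view = decrypt(sk_p, c1, c2)
data_user_view = decrypt(sk_r, c1, c2_re)
```

Under these assumptions the cloud server transforms the ciphertext without ever holding M or either private key, which matches the role the text assigns to it.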

When the real identity of the patient P needs to be obtained for treatment or medical insurance purposes, the data user R sends a request to the hospital. The hospital obtains the true identity of the patient P by calculating RIDP = IDP ⊕ H1(S^{1/β}) and returns it to the data user R. This works because S^{1/β} = g^{s/β} = τ^s, which is the value used to mask RIDP at registration. In our scheme, only the hospital system knows the β value, so only the hospital can extract the real identity of the patient.

Correctness:

5 Security analysis

5.1 Achieving goals

In this section, we illustrate how the proposed scheme achieves the design goals presented in “System model”.

The proposed scheme achieves data confidentiality and integrity. The EMR data are encrypted before being outsourced to the hospital server. The doctor uses the patient’s public key to encrypt the EMR. On the one hand, the patient uses his/her private key to decrypt the EMR ciphertext; on the other hand, the data user authorized by the patient uses his/her private key to decrypt the EMR re-encryption ciphertext.

The proposed scheme achieves access control. As mentioned in phase 4, if the data user wants to access the patient’s EMR, he/she first sends an authorization request to the patient. After the patient agrees, the cloud server generates a re-encryption key. The cloud server re-encrypts the EMR ciphertext with it to generate the re-encryption ciphertext that the data user can decrypt with his/her private key.

The proposed scheme achieves secure search. In phase 2 of our scheme, the EMR is encrypted with keyword search. In phase 3, the patient generates the trapdoor set to search his/her history health record to improve the doctor’s diagnosis of the patient. In this scenario, the keyword trapdoor contains the patient’s private key, so only the patient can generate the trapdoor and perform search on EMR.

5.2 Security proof

As the patient’s use of the data is similar to that of the data user, we only demonstrate security for the data user’s case.

Theorem 1. Our scheme is IND-CKA secure in the random oracle model if the mBDH assumption holds in G1 and G2.

Proof. We assume the existence of a polynomial-time adversary A1 with non-negligible advantage ϵ(k) in attacking the keyword privacy of our scheme, where k is the security parameter. We construct a simulator B that can compute the solution of the mBDH problem.

Let (g, g^α, g^β, g^γ) ∈ G1^4 be an instance of the mBDH problem, where g is the generator of G1 and α, β, γ ∈ Z_q^* are uniformly random choices. The goal of B is to output e(g, g)^{αβ/γ} ∈ G2 by interacting with A1 as follows:

H1 query: B maintains an initially empty table for H1. On an H1 query for a keyword wi, B checks this table. If <wi, hi, ai, ci> exists in the table, then B returns H1(wi) = hi. Otherwise, B generates a random coin ci ∈ {0, 1} such that Pr[ci = 0] = 1/(qT + 1), where qT is the maximum number of Trapdoor queries. B selects a random number ai ∈ Z_q^*. If ci = 0, B returns hi = […] to A1; if ci = 1, B returns hi = […] to A1. Thereafter, B adds <wi, hi, ai, ci> to the table.

H2 query: B maintains an initially empty table for H2. Upon receiving an H2 query about t′ ∈ G2 from A1, B checks this table. If t′ already exists in the table, B returns the stored V to A1. Otherwise, B selects a value V ∈ {0, 1}^{log2 q} randomly, sets H2(t′) = V, and adds <t′, V> to the table.

Phase 1. A1 makes several queries.

Uncorrupted key query: On input an index i, B selects xi ∈ Z_q^* randomly and outputs the public key pki = (g^γ)^{xi}. Thus, the private key is implicitly defined as ski = γxi. B adds <i, pki, xi> to LU.

Corrupted key query: On input an index i, B selects xi ∈ Z_q^* randomly and outputs the public key pki = g^{xi}. Thus, the private key is defined as ski = xi. B adds <i, pki, xi> to LC.

Trapdoor query: When A1 makes a trapdoor query on the keyword wi, B responds as follows:

  • B recovers <wi, hi, ai, ci>, <i, pki, xi>, <i, pki, xi> from the H1 table, LU, and LC, respectively.
  • If ci = 0, B aborts. Otherwise, when ci = 1, B computes […].
  • Depending on whether i ∈ LC or i ∉ LC, B computes Ti = […] accordingly. Then, Ti is the trapdoor for keyword wi and B returns Ti to A1.
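
The coin bias 1/(qT + 1) balances two requirements on B: every trapdoor query must hit a coin ci = 1, and the challenge needs at least one coin equal to 0. The short check below assumes independent coins and at most qT trapdoor queries, which is the usual setting for this style of argument; the bounds it verifies are standard facts and are not figures reported in this paper. The function names are illustrative.

```python
import math


def no_abort_in_trapdoor_queries(q_t: int) -> float:
    """Probability that all q_t trapdoor queries hit a coin c_i = 1."""
    return (1 - 1 / (q_t + 1)) ** q_t


def challenge_usable(q_t: int) -> float:
    """Probability that at least one of the coins c_0, c_1 equals 0."""
    return 1 - (1 - 1 / (q_t + 1)) ** 2


# For several query budgets, compare against the standard lower bounds
# 1/e for the trapdoor phase and 1/(q_t + 1) for the challenge phase.
checks = [
    (
        no_abort_in_trapdoor_queries(n) >= 1 / math.e,
        challenge_usable(n) >= 1 / (n + 1),
    )
    for n in (1, 10, 1000)
]
```

Because both probabilities are bounded below by quantities that are non-negligible for polynomial qT, a non-negligible adversary advantage would carry over to B in such an argument.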

Re-encryption key query: When A1 asks B about the re-encryption key rkij for two public keys pki, pkj, B responds as follows:

  • If neither pki nor pkj belongs to LC, B aborts.
  • Otherwise, B returns rkij = xj/xi to A1.

Challenge: Eventually, A1 issues a challenge on two keywords w0, w1, a message m, and a public key pki. If pki belongs to LC, then B aborts. Otherwise, B performs as follows:

  • B conducts two H1 queries to obtain h0, h1 ∈ G1 such that H1(w0) = h0, H1(w1) = h1. If both c0 = 1 and c1 = 1 hold, then B aborts.
  • Otherwise, at least one of c0 and c1 is equal to 0. Then B randomly picks b ∈ {0, 1} so that cb = 0.
  • B returns the challenge ciphertext […] to A1, where V ∈ {0, 1}^{log2 q}.
  • B implicitly defines […] and […].

Phase 2. A1 can continue to issue several queries as in phase 1 on keyword wi, where wi ≠ w0 and wi ≠ w1.

Guess: Finally, A1 outputs its guess b′ ∈ {0, 1} as to whether the challenge ciphertext is the result of keyword w0 or w1. Then B chooses the pair (t′, V) from the H2 table and outputs […] as its guess for e(g, g)^{αβ/γ}.

Theorem 2. Our scheme is IND-CPA secure in the random oracle model if the QDBDH assumption holds in G1 and G2.

Proof. We assume the existence of a polynomial-time adversary A2 with non-negligible advantage ϵ(k) in attacking our scheme, where k is the security parameter. We construct a simulator B that can solve the QDBDH problem.

Let (g, g^a, g^b) ∈ G1^3 be an instance of the QDBDH problem, where g is the generator of G1 and a, b ∈ Z_q^* are uniformly random choices. The goal of B is to output e(g, g)^{a/b} ∈ G2 by interacting with A2 as follows:

H1 query: B maintains an initially empty table for H1. Upon receiving an H1 query about w from A2, B checks this table. If w already exists in the table, B returns the stored h to A2. Otherwise, B selects a value h ∈ {0, 1}^{log2 q} randomly, sets H1(w) = h, and adds <w, h> to the table.

H2 query: B maintains an initially empty table for H2. Upon receiving an H2 query about t′ ∈ G2 from A2, B checks this table. If t′ already exists in the table, B returns the stored V to A2. Otherwise, B selects a value V ∈ {0, 1}^{log2 q} randomly, sets H2(t′) = V, and adds <t′, V> to the table.

Phase 1. A2 makes several queries.

Public key query: B generates a random coin ci ∈ {0, 1} and selects a random value xi ∈ Z_q^*. If ci = 1, B outputs the public key pki = […]. Otherwise, B outputs the public key pki = g^{xi}, where the private key is defined as ski = xi. B adds <ci, pki, xi> to table LC.

Private key query: B recovers <ci, pki, xi> from LC. If ci = 0, the private key ski = xi is returned to A2. Otherwise, it aborts.

Re-encryption key query: The adversary A2 can adaptively ask B for the re-encryption key rkij for any two public keys pki, pkj and B generates the re-encryption key as follows:

  • If ci = 1 and cj = 1, B aborts.
  • Otherwise, B responds rkij = xj/xi to A2.

Re-encryption query: Based on the result of the re-encryption key query, B obtains the re-encryption ciphertext through the re-encryption algorithm and returns it to A2.

Decryption query: After obtaining the re-encryption ciphertext C′, B recovers <ci, pki, xi> associated with the data user from LC. If ci = 0, B sets the message […]. Otherwise, B sets the message […].

Challenge: Eventually, A2 issues a challenge on two messages m0, m1 and a public key pki. B recovers the tuple <ci, pki, xi> from LC. If ci = 1, then B reports failure and aborts. Otherwise, B randomly selects δ ∈ {0, 1} and sets the challenge ciphertext as follows: […]

Phase 2. A2 can continue to issue several queries as in phase 1 on message mi, where mi ≠ m0 and mi ≠ m1.

Guess: Finally, A2 outputs its guess b′ ∈ {0, 1} to check whether the challenge ciphertext is the result of message m0 or m1. If cδ = 1, then the ciphertext is a QDBDH instance.

6 Performance analysis

In this section, we first present a theoretical analysis of the performance of the proposed scheme and then evaluate its efficiency by numerical simulation. To show the performance more intuitively, we implemented our scheme, as well as the schemes of Wu [23] and Wang [24], in C using the Pairing-Based Cryptography (PBC) Library [25], and ran them on Linux in a virtual machine on a PC (HP PC, 3.1 GHz CPU, 4 GB RAM). In the experiment, we used elliptic curves with a base field size of 512 bits and an embedding degree of 2. The security level is set to |p| = 512.

6.1 Theoretical analysis

In this section, we compare the computation overhead of the proposed scheme with that of the other schemes from a theoretical perspective. We denote by Te, Tp, Th, TH, and Tmul the computation cost of an exponentiation operation, a bilinear pairing operation, a general hash function, a hash-to-point operation, and a multiplication operation, respectively. The running times of these basic operations are presented in Table 1.

As shown in Table 1, TH and Tmul are much smaller than the others, so the hash-to-point and multiplication times are negligible. In descending order of running time, the operations are Te, Tp, Th, TH, and Tmul, and the computational cost of the bilinear pairing operation is much higher than that of the operations in other cryptographic algorithms. The computation cost of the compared schemes in the index generation and search phases is presented in Table 2, where n denotes the number of keywords.
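A phase cost of this kind is a weighted sum of operation counts, each count being linear in the number of keywords n. The sketch below illustrates the arithmetic only. The unit costs and operation counts are placeholders in arbitrary units, not the values of Table 1 or Table 2, and the names `UNIT_COST` and `phase_cost` are hypothetical.

```python
# Unit costs in arbitrary units. Placeholders only, NOT the values of Table 1.
UNIT_COST = {"Te": 5, "Tp": 4, "Th": 3, "TH": 2, "Tmul": 1}


def phase_cost(op_counts, n, unit_cost=UNIT_COST):
    """Total cost of one phase for n keywords.

    op_counts maps an operation name to (per_keyword, fixed): the operation
    is executed per_keyword * n + fixed times in the phase.
    """
    return sum((per_kw * n + fixed) * unit_cost[op]
               for op, (per_kw, fixed) in op_counts.items())


# Hypothetical phase: 2 exponentiations and 1 pairing per keyword, plus one
# fixed exponentiation. These counts are illustrative, not taken from Table 2.
example_counts = {"Te": (2, 1), "Tp": (1, 0)}
cost_10 = phase_cost(example_counts, n=10)
```

With real per-operation timings and the per-scheme counts substituted, the same function gives the predicted growth of each phase as n increases.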

As shown in Table 2, in both the index generation phase and the search phase, the descending order of computation cost is Wang’s scheme [24], our scheme, and Wu’s scheme [23]. Since Wu’s scheme [23] only implements single-keyword encryption whereas our scheme implements multi-keyword encryption, the computation cost of our scheme in the index generation and search phases is higher than that of Wu’s scheme [23].

6.2 Numerical simulation

We compared our scheme with the schemes proposed by Wu [23] and Wang [24] through numerical simulation. Both our scheme and Wang’s scheme [24] support multi-keyword search over ciphertext, whereas Wu’s scheme [23] supports only single-keyword search. In the simulation, we use the same number of keywords in the index generation and search phases, and compare the computational overhead of each phase for different numbers of keywords. We set the number of keywords to n = 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000. Each reported result is the average running time over 10 runs of the algorithm. For more information, see S1 Appendix and S1 File.
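The measurement procedure described above (n from 100 to 1000 in steps of 100, each point averaged over 10 runs) can be sketched as a small timing harness. This is an illustrative Python sketch, not the code in S1 File: `build_index` is a stand-in workload that merely hashes each keyword, and all function names are assumptions.

```python
import hashlib
import time

KEYWORD_COUNTS = list(range(100, 1001, 100))  # n = 100, 200, ..., 1000
RUNS = 10                                     # runs averaged per data point


def build_index(keywords):
    """Stand-in workload: hash each keyword. NOT the paper's index algorithm."""
    return [hashlib.sha256(w.encode()).digest() for w in keywords]


def average_seconds(func, arg, runs=RUNS):
    """Average wall-clock time of func(arg) over the given number of runs."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        func(arg)
        total += time.perf_counter() - start
    return total / runs


def benchmark(func, counts=KEYWORD_COUNTS):
    """Map each keyword count n to the average running time of func."""
    results = {}
    for n in counts:
        keywords = [f"kw{i}" for i in range(n)]
        results[n] = average_seconds(func, keywords)
    return results


results = benchmark(build_index)
```

Averaging over repeated runs reduces the effect of scheduling noise, which matters in a virtual machine where single measurements can vary noticeably.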

As illustrated in Fig 2, the index generation time increases with the number of keywords. The index generation time of our scheme is lower than that of Wang’s scheme [24] but higher than that of Wu’s scheme [23]. The reason is that our scheme uses bilinear pairing operations in the keyword encryption process, whereas Wu’s scheme [23] does not. Fig 3 shows the time cost of the search phase for all schemes. The time spent increases linearly with the number of keywords. Wu’s scheme [23] and our scheme differ only slightly in the search phase, and both take longer than Wang’s scheme [24].

7 Conclusion

We presented an EMR data-sharing scheme that provides privacy protection, secure storage, and secure sharing based on searchable encryption and proxy re-encryption, addressing the data security and personal privacy problems that arise when EMRs are shared through cloud storage. While protecting patient privacy, the scheme enables patients to access their own EMRs. Once the patient grants authorization, data users can also access the EMR, which makes the approach practical. The EMR ciphertext and keyword index are stored on the cloud server so that the patient can retrieve EMRs by keyword search. After the patient authorizes a data user to access his/her EMR, the cloud server generates a re-encryption key for that data user. The cloud server then re-encrypts the EMR ciphertext with the re-encryption key and sends it to the data user, who decrypts it with his/her private key.

Supporting information

S1 Appendix. Data used to build graphs.

The experimental data used for plotting in Figs 2 and 3.

https://doi.org/10.1371/journal.pone.0244979.s001

(DOCX)

S1 File. Procedure source code.

The program source code for the numerical simulation of our scheme, Wu’s scheme, and Wang’s scheme.

https://doi.org/10.1371/journal.pone.0244979.s002

(ZIP)

References

  1. Au MH, Yuen TH, Liu JK. A general framework for secure sharing of personal health records in cloud system. Journal of Computer and System Sciences, 2017, 90: 46–62.
  2. Xia Z, Wang X, Zhang L. A privacy-preserving and copy-deterrence content-based image retrieval scheme in cloud computing. IEEE Transactions on Information Forensics and Security, 2017, 11(11): 2594–2608.
  3. Wu CH, Chiu RK, Yeh HM. Implementation of a cloud-based electronic medical record exchange system in compliance with the integrating healthcare enterprise’s cross-enterprise document sharing integration profile. International Journal of Medical Informatics, 2017, 107: 30–39. pmid:29029689
  4. Chenthara S, Ahmed K, Wang H, Whittaker F. Security and privacy-preserving challenges of e-Health solutions in cloud computing. IEEE Access, 2019, 7(99): 74361–74382.
  5. Vimalachandran P, Zhang Y, Cao J. Preserving data privacy and security in Australian my health record system: A quality health care implication. In: International Conference on Web Information Systems Engineering. Springer, Cham, 2018: 111–120.
  6. Fu Z, Huang F, Sun X, Vasilakos A, Yang C. Enabling semantic search based on conceptual graphs over encrypted outsourced data. IEEE Transactions on Services Computing, 2016, 99: 1–1.
  7. Xia Z, Wang X, Sun X, Wang Q. A secure and dynamic multi-keyword ranked search scheme over encrypted cloud data. IEEE Transactions on Parallel and Distributed Systems, 2016, 27(2): 340–352.
  8. Sun J, Wang X, Wang S, Ren L. A searchable personal health records framework with fine-grained access control in cloud-fog computing. Plos One, 2018, 13(11): e0207543. pmid:30496194
  9. Shi Y, Liu J, Han Z, Zheng Q. Attribute-based proxy re-encryption with keyword search. Plos One, 2014, 9(12): e116325. pmid:25549257
  10. Song DX, Wagner D, Perrig A. Practical techniques for searches on encrypted data. In: Proceeding 2000 IEEE Symposium on Security and Privacy. IEEE, 2000: 44–55.
  11. Boneh D, Di Crescenzo G, Ostrovsky R, Persiano G. Public key encryption with keyword search. In: International Conference on the Theory and Applications of Cryptographic Techniques. Springer, Berlin, Heidelberg, 2004: 506–522.
  12. Baek J, Safavi-Naini R, Susilo W. Public key encryption with keyword search revisited. In: International Conference on Computational Science and Its Applications. Springer, Berlin, Heidelberg, 2008: 1249–1259.
  13. Xu P, Jin H, Wu Q. Public-key encryption with fuzzy keyword search: A provably secure scheme under keyword guessing attack. IEEE Transactions on Computers, 2012, 62(11): 2266–2277.
  14. Liu X, Xia Y, Yang W, Yang F. Secure and efficient querying over personal health records in cloud computing. Neurocomputing, 2018, 274: 99–105.
  15. Li X, Song Z, Ren J, Xu L. Attribute-based searchable encryption of electronic medical records in cloud computing. Chinese Computer Science, 2017, 44(Z11): 342–347.
  16. Ma M, He D, Khan MK, Chen J. Certificateless searchable public key encryption scheme for mobile healthcare system. Computers and Electrical Engineering, 2018, 65: 413–424.
  17. Shao J, Cao Z, Liang X, Lin H. Proxy re-encryption with keyword search. Information Sciences, 2010, 180(13): 2576–2587.
  18. Guo L, Lu B. Efficient proxy re-encryption with keyword search scheme. Chinese Journal of Computer Research and Development, 2014, 51(6): 1221–1228.
  19. Chen Z, Li S, Guo Y, Wang Y, Chu Y. A limited proxy re-encryption with keyword search for data access control in cloud computing. In: International Conference on Network and System Security. Springer, Cham, 2015: 82–95.
  20. Sahai A, Waters B. Fuzzy identity-based encryption. In: Annual International Conference on the Theory and Applications of Cryptographic Techniques. Springer, Berlin, Heidelberg, 2005: 457–473.
  21. Ateniese G, Fu K, Green M, Hohenberger S. Improved proxy re-encryption schemes with applications to secure distributed storage. ACM Transactions on Information and System Security, 2006, 9(1): 1–30.
  22. Wang X, Zhang A, Xie X, Ye X. Secure-aware and privacy-preserving electronic health record searching in cloud environment. International Journal of Communication Systems, 2019, 32(8): e3925.1–e3925.11.
  23. Wu Y, Lu X, Su J, Chen P. An efficient searchable encryption against keyword guessing attacks for sharable electronic medical records in cloud-based system. Journal of Medical Systems, 2016, 40(12): 258. pmid:27722976
  24. Wang T, Au MH, Wu W. An efficient secure channel free searchable encryption scheme with multiple keywords. In: International Conference on Network and System Security. Springer, Cham, 2016: 251–265.
  25. The pairing-based cryptography library. Available from: http://crypto.stanford.edu/pbc/.